Crawl PDF documents
-
Is it possible to crawl the content or metadata of PDF documents? We have a number of PDF documents linked in our documentation that don't turn up in our search. Can SU directly crawl PDF content or metadata?
-
@lnelson_delphix If the PDF documents have metadata associated with them, SearchUnify can crawl them. If you can share one document for reference, @PT can review it and help you with this.
-
@lnelson_delphix If the metadata in the PDF is correct, SU will crawl the title from that metadata. Can you please share one example URL?
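In case it helps to check a document before sharing it, here is a minimal Python sketch that looks for a title in a PDF's metadata. It is an assumption-heavy illustration, not how the SU crawler works: `find_pdf_title` is a hypothetical helper, and it only finds a `/Title` entry stored as an uncompressed literal string. Titles stored as hex strings, in UTF-16, in XMP metadata, or inside compressed object streams will be missed, so a `None` result does not prove the title is absent.

```python
import re
from typing import Optional

# Matches a literal-string /Title entry such as "/Title (User Guide)".
# Backslash-escaped characters inside the parentheses are accepted;
# unescaped nested parentheses are not handled.
_TITLE_RE = re.compile(rb"/Title\s*\(((?:\\.|[^\\()])*)\)")


def find_pdf_title(data: bytes) -> Optional[str]:
    """Return the first plain-text /Title found in raw PDF bytes, or None.

    Hypothetical helper for a quick manual check only.
    """
    match = _TITLE_RE.search(data)
    if match is None:
        return None
    raw = match.group(1)
    # Undo the simple escapes \( \) and \\ used in PDF literal strings.
    raw = re.sub(rb"\\([()\\])", rb"\1", raw)
    title = raw.decode("latin-1").strip()
    return title or None


if __name__ == "__main__":
    import sys

    # Usage: python check_title.py some-document.pdf
    with open(sys.argv[1], "rb") as handle:
        print(find_pdf_title(handle.read()))
```

If this prints `None` or an unhelpful title for a document that opens fine, a dedicated PDF library or the document properties dialog of a PDF viewer will give a more reliable answer.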
-
The PDF documents linked from our documentation are only returned in the search results when we search for the title of the document. I'm not sure whether we need to tune the results for specific keywords that are not in the document titles.
-
Hey @lnelson_delphix, the SU crawler already crawls PDFs on some content sources. On which content source would you like to crawl PDFs?