How to get the short intro from an attachment (vpp file)

Hi,

I would like to fetch the first, say, 150 characters from a file hit (Word, PDF).

When I search with a search term, I can use x.SearchAttachment().AsHighlighted(...), but when no search query is used, I would just like to get the introduction of the file instead.

(and I do not mean the SearchSummary())

Is this possible?

#61809
Oct 02, 2012 13:45

Hi Fredrik,

Later versions of the .NET API have an AsCropped method for attachments (it has always existed for strings), so you can use x.SearchAttachment().AsCropped. See an example (although for a string field) in my reply here.

#61810
Oct 02, 2012 14:06

Yes, there is now an AsCropped() that produces an excerpt without a search term.

However, for PDF files the excerpt is not very useful: it mostly seems to be empty or just the digit "0".

Have I misunderstood something, or could this be a bug?

On the other hand, when there is a search term, AsHighlighted() works fine.

#63026
Nov 07, 2012 13:47

bump

No search phrase provided

Excerpt = x.SearchAttachment().AsCropped(2000) (the PDF contains text and images)

gives me nothing, or just a "0"

Is this a bug or am I doing it the wrong way?

#63326
Nov 15, 2012 10:11

Hi,

It seems to be working for me with PDFs, and all the tests we have for cropping PDFs pass as well. Is it a specific PDF that causes the problem, or does it happen with every PDF you have tried?

PDF indexing is hard, and some PDFs produce really weird indentation when the content is extracted. Are you sure the result is always "0", or could it be whitespace?

#63372
Nov 16, 2012 9:57

It could be whitespace, but shouldn't the AsCropped method take care of that?

Remember that this only happens when no search term is provided.

#64420
Dec 20, 2012 11:22

For attachments where the "cropped" text contains newlines, the AsCropped function might fail. We will fix this issue in the backend (it won't require any action from the affected users).

/Henrik
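
Until such a fix is available, one possible client-side workaround is to crop the text yourself after collapsing whitespace and newlines. The sketch below is not part of the Find API: `ExcerptHelper` and `CropIntro` are hypothetical names, and it assumes you already have the extracted attachment text as a plain string (how you obtain it is outside the scope of this example).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the Find API.
public static class ExcerptHelper
{
    // Matches any run of whitespace: spaces, tabs, and newlines.
    private static readonly Regex WhitespaceRun = new Regex(@"\s+", RegexOptions.Compiled);

    // Returns at most maxLength characters of text, with all whitespace
    // runs collapsed to single spaces and no word split at the end
    // when a word boundary is available.
    public static string CropIntro(string text, int maxLength)
    {
        if (maxLength < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxLength));
        }

        // Null, empty, or whitespace-only input yields an empty excerpt.
        if (string.IsNullOrWhiteSpace(text))
        {
            return string.Empty;
        }

        string normalized = WhitespaceRun.Replace(text, " ").Trim();
        if (normalized.Length <= maxLength)
        {
            return normalized;
        }

        string cropped = normalized.Substring(0, maxLength);

        // If the cut landed in the middle of a word, back up to the last space.
        int lastSpace = cropped.LastIndexOf(' ');
        if (lastSpace > 0 && normalized[maxLength] != ' ')
        {
            cropped = cropped.Substring(0, lastSpace);
        }

        return cropped.TrimEnd();
    }
}
```

For the 150-character introduction asked about at the top of the thread, the call would be `ExcerptHelper.CropIntro(extractedText, 150)`, where `extractedText` is a placeholder for the attachment text. Cutting at a word boundary is a design choice; drop the `lastSpace` step if a hard cut at exactly `maxLength` characters is preferred.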

#64773
Jan 09, 2013 16:11