Database size
Glad to see you're using sqlite3, as I value this software pretty highly.
But the database gets pretty huge.
I got a database of about 50MB for just 2k files.
AVG(LENGTH(pv)) is 18831 (bytes) in that database.
18kb+ per thumbnail (even if there is more meta-data hidden in pv) seems unreasonably high...
Indexing my 250k pictures would then give me 4GB+ of blob data :p
A test with one thumb (db blob against a photoshopped thumb) gave:
db blob size of pv: ~ 18kb
db blob gzip'ed (per 7z/max): ~ 13kb
bmp32: ~ 15kb
bmp32 + gzip: ~ 10kb
png24 + alpha: ~ 7kb
png8: 3kb
jpeg80: ~ 3kb
jpeg30: ~ 1kb
PS: it's easy to implement field-based encoding/compression by using sqlite3_create_function()...
Simply register the encoder/decoder when you init your |sqlite3*| and then modify the queries to use those
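The idea in the PS can be sketched roughly as follows. This is only an illustration in Python, whose sqlite3 module exposes sqlite3_create_function() as Connection.create_function(). The SQL function names pack/unpack and the table name thumbs are made up for the example; only the pv column name comes from the post, and zlib stands in for whatever encoder would really be used.

```python
import sqlite3
import zlib


def _pack(blob):
    # Encoder: compress a BLOB before it is written (NULL stays NULL).
    return None if blob is None else zlib.compress(blob, 6)


def _unpack(blob):
    # Decoder: restore the original bytes when the BLOB is read back.
    return None if blob is None else zlib.decompress(blob)


def open_db(path=":memory:"):
    """Open the database and register the hypothetical pack()/unpack() SQL functions."""
    con = sqlite3.connect(path)
    con.create_function("pack", 1, _pack)
    con.create_function("unpack", 1, _unpack)
    # Hypothetical table; only the column name pv is taken from the thread.
    con.execute("CREATE TABLE IF NOT EXISTS thumbs (name TEXT PRIMARY KEY, pv BLOB)")
    return con


def store(con, name, blob):
    # The query calls the encoder, so callers never see compressed data.
    con.execute("INSERT OR REPLACE INTO thumbs (name, pv) VALUES (?, pack(?))", (name, blob))


def load(con, name):
    # The query calls the decoder on the way out.
    row = con.execute("SELECT unpack(pv) FROM thumbs WHERE name = ?", (name,)).fetchone()
    return None if row is None else row[0]
```

The functions have to be registered on every connection right after opening it, because they live in the connection and not in the database file.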
Re: Database size
But which compression do you use for the cache db? None, ZIP or JPEG?
Pierre.
Re: Database size
xnview wrote: But which compression do you use for cache db? None, ZIP or JPEG?
If that blob just contains the thumbnail (as an image), I'd make JPEGs (or PNGs) out of it, as this is probably the best image compression.
If it is mixed data I would at least (g)zip it.
Or even better, store the thumbnail part as JPEG and the other stuff from the mix in another field, using gzip (or no) compression.
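A minimal sketch of that split layout, assuming a hypothetical table cache with one column for the already-encoded thumbnail and one for the remaining data. The table, column and function names are invented for illustration and say nothing about how XnView actually lays out its cache.

```python
import sqlite3
import zlib

# Hypothetical schema: the image bytes and the other data live in separate columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS cache (
    name  TEXT PRIMARY KEY,
    thumb BLOB,   -- already-encoded JPEG/PNG bytes, stored untouched
    meta  BLOB    -- everything else, zlib-compressed
)
"""


def open_cache(path=":memory:"):
    con = sqlite3.connect(path)
    con.execute(SCHEMA)
    return con


def put(con, name, thumb_bytes, meta_bytes):
    # JPEG/PNG data is compressed already, so only the metadata goes through zlib.
    con.execute(
        "INSERT OR REPLACE INTO cache (name, thumb, meta) VALUES (?, ?, ?)",
        (name, thumb_bytes, zlib.compress(meta_bytes, 9)),
    )


def get_thumb(con, name):
    # Reading a thumbnail never has to touch or decompress the metadata.
    row = con.execute("SELECT thumb FROM cache WHERE name = ?", (name,)).fetchone()
    return None if row is None else row[0]


def get_meta(con, name):
    row = con.execute("SELECT meta FROM cache WHERE name = ?", (name,)).fetchone()
    return None if row is None else zlib.decompress(row[0])
```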
Re: Database size
No, you don't understand me. Which compression method do you use in option/Cache?
Pierre.
Re: Database size
xnview wrote: No, you don't understand me. Which compression method do you use in option/Cache?
Default one I assume...
Let me see...
hmm... Seems it was ZIP.
Some new tests on |AVG(LENGTH(pv))| using a fileset of 1117 files/690MB and leaving everything else (thumbnail dimensions) default:
None: 13995.95
ZIP: 12889.05 (should be more like 7-10k)
High JPEG: 7318.35 (should be more like 2-4k)
Lossy JPEG: 7027.64 (should be more like 1-2k)
Some pv fields seem to be huge: up to 64kb in my tests.
Could it be that you're storing meta-data (EXIF, IPTC and such things) within the thumbs?
I uploaded some of those dump files to http://celebnamer.celebworld.ws/stuff/xnview/thumbdump/
The first 7 of them (all 40+kb) have EXIF, IPTC and XMP meta-data (Origin files).
The rest (all below 18kb) have no meta-data.
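For anyone who wants to repeat this kind of measurement, here is a rough sketch of the queries involved. It assumes a table called thumbs with a name column next to pv; both of those names are guesses for the example, only pv is taken from this thread.

```python
import sqlite3


def blob_stats(con):
    """Return (row count, average pv size, largest pv size) in bytes."""
    # LENGTH() on a BLOB returns the number of bytes.
    return con.execute(
        "SELECT COUNT(*), AVG(LENGTH(pv)), MAX(LENGTH(pv)) FROM thumbs"
    ).fetchone()


def largest(con, n=7):
    """Return the n biggest pv entries as (name, size) pairs, biggest first."""
    return con.execute(
        "SELECT name, LENGTH(pv) AS size FROM thumbs ORDER BY size DESC LIMIT ?",
        (n,),
    ).fetchall()
```

Listing the biggest entries next to the average makes it easier to see whether a few heavy rows (for example files with a lot of meta-data) are pulling the average up.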
Another observation:
Dumping the pv data into files, 1117 in my case, and applying gzip shrinks the size a lot.
I therefore assume that there is a lot of additional compression possible (and gzip --fast is actually fast).
-- High JPEG
raw (from db): 11416kb
find dump/ | xargs gzip -c9 > dump.bin: 5492kb
find dump/ | xargs gzip -c1 > dump.bin: 5617kb
-- ZIP
raw (from db): 19930kb
find dump/ | xargs gzip -c9 > dump.bin: 13672kb
find dump/ | xargs gzip -c1 > dump.bin: 13830kb
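The same comparison can be approximated without dumping files, as in the sketch below. Note that gzip -c given several files compresses each one separately and concatenates the results, so the sketch also compresses blob by blob. The function name and the way the blobs are obtained are assumptions for the example.

```python
import gzip


def compare_levels(blobs, levels=(1, 9)):
    """Sum the raw size and the per-blob gzip size for each compression level."""
    result = {"raw": sum(len(b) for b in blobs)}
    for level in levels:
        # Each blob is compressed on its own, mirroring gzip -c1 / -c9 per file.
        result[level] = sum(len(gzip.compress(b, compresslevel=level)) for b in blobs)
    return result
```

Fed with the pv values of a real cache (for example from a SELECT pv query), the returned totals correspond to the "raw", "-c1" and "-c9" rows above.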
Re: Database size
Yes, I store all metadata, but with the old cache system, do you get almost the same size for the db?
Perhaps you can send me one of your first items?
Currently I compress the metadata with zlib too...
Pierre.
Re: Database size
xnview wrote: Yes, i store all metadata, but with old cache system do you have almost same size for db?
The old cache gives about 70-80% of the size for those 1117 files, in all modes.
But that doesn't really matter. The new cache system gives the opportunity to improve it.
xnview wrote: Perhaps you can send me 1 of your first items?
Send you what exactly?
Some pv field dumps are available from http://celebnamer.celebworld.ws/stuff/xnview/thumbdump/
I added some using "Low JPEG"... that folder also contains a "find | xargs cat" assembled dump.bin and a "find | xargs gzip -c1" assembled dump.bin.gz.
As you can see from comparing the two, there is still a lot of compression possible, even with gzip in "--fast" (or "wb1") mode.
xnview wrote: Currently i compress with zlib metadata too...
Is it really necessary to store it as well?
Wouldn't a bitfield indicating "5 = HAS_EXIF | HAS_XMP" be enough?
Metadata adds up to 40-50kb for a well-tagged file (raw).
The compression seems to be "faulty" as I can easily compress these even more (see remarks above).
Looking at those thumbs that belong to images without meta-data, the thumb size seems reasonable, at least in "Low JPEG" mode.
This is something I didn't really realize till now as most of my files contain the full range of meta-data (COM, IPTC, EXIF, XMP).
So my current conclusion is: meta-data is not compressed well enough, or it shouldn't be fully stored at all.
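The bitfield suggested above could look roughly like this. The bit values are an assumption chosen so that HAS_EXIF | HAS_XMP comes out as 5, matching the example in the post; the IPTC and COM bits and the helper name are invented for illustration. Such a field only records whether a block is present, so it would not be enough if the actual tag values have to be shown from the cache.

```python
from enum import IntFlag


class Meta(IntFlag):
    # Assumed bit assignment; only "5 = HAS_EXIF | HAS_XMP" is from the post.
    HAS_EXIF = 1
    HAS_IPTC = 2
    HAS_XMP = 4
    HAS_COM = 8


def flags_from(sections):
    """Build the bitfield from names such as "EXIF" or "XMP"."""
    flags = Meta(0)
    for name in sections:
        flags |= Meta["HAS_" + name.upper()]
    return flags
```

The resulting integer fits into a single small column, for example int(flags_from(["EXIF", "XMP"])) for a file that carries EXIF and XMP data.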
Re: Database size
MaierMan wrote: Send you what exactly?
The picture file.
MaierMan wrote: Is it really necessary to store it as well? Wouldn't a bitfield indicating "5 = HAS_EXIF | HAS_XMP" be enough?
Yes, to be able to show IPTC/EXIF in labels (thumbnails view).
Pierre.
Re: Database size
xnview wrote: The picture file
Uploaded the files corresponding to those thumbs to:
http://celebnamer.celebworld.ws/stuff/x ... mp/Origin/
xnview wrote: Yes, to be able to show IPTC/EXIF in labels (thumbnails view)
If it's just to display those pictograms/labels, then it would be sufficient to store a bit indicating "EXIF-there/not-there" and nothing more.
Those other tools (Preview, Properties) seem to load the files again anyway. At least that's what FileMon tells me...
Re: Database size
MaierMan wrote: Those other tools (Preview, Properties) seem to load the files again anyway.
No, for thumbnail labels I use the cache; I don't load the file.
Pierre.
Re: Database size
OK, I have removed all unneeded data, and now the db is 50% smaller.
Could you send me an email? I would like to send you an alpha version, and perhaps ask you some questions about sqlite.
Pierre.