Tdpkg


Surprisingly, the Debian package manager dpkg stores its database as a very large number of small files under the /var/lib/dpkg/info directory, without any index. As a result, it is very slow on a cold first run, until all the info files have made it into the filesystem cache.

tdpkg is an interesting solution to this problem: a wrapper library that speeds up the loading of dpkg's *.list files using either tokyocabinet or sqlite3. The larger your dpkg database is, the larger the speedup you will notice.
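To get a feel for how large your own database is, you can count the *.list files. This is only a minimal sketch: count_list_files is a helper name made up for illustration (it is not part of dpkg or tdpkg), and the default directory is the one mentioned above.

```shell
# Count the *.list files directly inside a dpkg info directory.
# count_list_files is an illustrative helper, not part of dpkg or tdpkg.
count_list_files() {
    dir="${1:-/var/lib/dpkg/info}"
    # -maxdepth 1: do not descend into subdirectories
    find "$dir" -maxdepth 1 -type f -name '*.list' | wc -l | tr -d ' '
}
```

Calling count_list_files with no argument inspects /var/lib/dpkg/info; pass another directory to inspect a copy instead.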

It currently works on Debian GNU/Linux.

Build

Requirements: build-essential, libsqlite3-dev (for the sqlite3 backend) and libtokyocabinet-dev (for the tokyocabinet backend)

Type make to build tdpkg with tokyocabinet support.

Type make CACHE=sqlite instead for sqlite3 support.

How it works

tdpkg is a simple wrapper library that, when loaded through LD_PRELOAD, intercepts file operations such as open() and read(). For each one it checks whether the requested file is under /var/lib/dpkg/info and whether its contents are already cached in an indexed database (tokyocabinet or sqlite3). If so, it returns the data from the cache database instead of actually reading the file.

My personal opinion is that all this indexing and caching should really be implemented inside dpkg itself, but the advantage of a wrapper library like tdpkg is that anyone can use it, not just people who patch their dpkg.

Stability

The library is experimental and comes with no warranty of any sort!

In theory it could make your system, and dpkg and apt-get in particular, highly unstable.

In my experience of more than a year of use, though, tdpkg almost never fails, and when it does, it does not corrupt the real dpkg database. The only negative effect is a temporary inability to install some packages while tdpkg is in use. Such problems have always been fixed by running apt-get with tdpkg temporarily disabled.

Usage

Please make a backup copy of your /var/lib/dpkg/info/ directory before starting to use tdpkg.
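One way to make that backup is a plain recursive copy. The sketch below is only an illustration: backup_dpkg_info is a made-up helper name and the default destination /var/lib/dpkg/info.bak is an assumption, so pick whatever location suits you. Copying the real directory needs root.

```shell
# Copy a directory tree, preserving ownership, modes and timestamps.
# backup_dpkg_info and the default destination are illustrative assumptions.
backup_dpkg_info() {
    src="${1:-/var/lib/dpkg/info}"
    dest="${2:-/var/lib/dpkg/info.bak}"
    # Never clobber an earlier backup.
    if [ -e "$dest" ]; then
        echo "refusing to overwrite existing $dest" >&2
        return 1
    fi
    cp -a "$src" "$dest"
}
```

The function refuses to run if the destination already exists, so an older backup is not silently overwritten.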

Manual usage:

LD_PRELOAD=./libtdpkg.so dpkg ... 

To use tdpkg by default, put aliases into your ~/.bashrc (make sure you use the absolute path to libtdpkg.so). Note that ~/.bashrc only affects your own interactive bash sessions:

alias dpkg="LD_PRELOAD=/path/to/libtdpkg.so dpkg"
alias apt-get="LD_PRELOAD=/path/to/libtdpkg.so apt-get" 
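Since the Stability section recommends running apt-get with tdpkg temporarily disabled when something goes wrong, a small wrapper function with an off switch may be more convenient than aliases. This is a sketch of one possible alternative, not something shipped with tdpkg: the names with_tdpkg, TDPKG_LIB and TDPKG_DISABLE are invented here, and /path/to/libtdpkg.so is the same placeholder as above.

```shell
# Path to the library; /path/to/libtdpkg.so is a placeholder, replace it
# with the absolute path of your build. TDPKG_LIB is an invented name.
TDPKG_LIB="${TDPKG_LIB:-/path/to/libtdpkg.so}"

# Run a command with tdpkg preloaded, unless TDPKG_DISABLE is non-empty.
# with_tdpkg and TDPKG_DISABLE are illustrative names, not part of tdpkg.
with_tdpkg() {
    if [ -n "${TDPKG_DISABLE:-}" ]; then
        "$@"                              # run the command untouched
    else
        env LD_PRELOAD="$TDPKG_LIB" "$@"  # preload only for this command
    fi
}
```

With this in place, with_tdpkg apt-get install some-package uses the library, while TDPKG_DISABLE=1 with_tdpkg apt-get install some-package bypasses it for a single run (some-package is a placeholder).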

The cache for both sqlite3 and tokyocabinet is located at /var/lib/dpkg/info/tdpkg.cache.

Benchmarking

Reading the dpkg database is dominated by file system operations. For this reason, the kernel cache must be dropped (as root) before each timed run of either tdpkg or dpkg:

echo 1 > /proc/sys/vm/drop_caches
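A cold-cache timing run can be wrapped in a small helper so that both variants are measured the same way. This is a rough sketch under stated assumptions: cold_run is a made-up name, timing has one-second resolution, and it keeps the value 1 from the command above. The kernel documentation also describes writing 3, which additionally drops dentries and inodes, and suggests running sync first because dirty pages cannot be dropped.

```shell
# Time one run of a command after dropping the kernel page cache.
# cold_run is an illustrative helper; dropping caches needs root.
cold_run() {
    sync    # flush dirty pages first so that they can be dropped
    # 1 = drop the page cache, as in the command above
    { echo 1 > /proc/sys/vm/drop_caches; } 2>/dev/null ||
        echo "warning: cannot drop caches (not root?), timing a warm run" >&2
    start=$(date +%s)
    status=0
    "$@" > /dev/null || status=$?
    end=$(date +%s)
    echo "$((end - start))s (exit $status): $*"
}
```

For a comparison, run as root something like cold_run dpkg -S /bin/ls and then cold_run env LD_PRELOAD=/path/to/libtdpkg.so dpkg -S /bin/ls. The choice of dpkg -S as a query that has to consult the file lists is an assumption made for this example, as is the tdpkg cache having been populated by an earlier run.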