Tdpkg: differences between revisions

From YourcmcWiki
Line 1: Line 1:
= tdpkg =
+
It’s surprising, but Debian package manager [http://en.wikipedia.org/wiki/Dpkg dpkg] stores its database in the form of a very big number of small files, located under <tt>/var/lib/dpkg/info</tt> directory, '''without any indexes'''! So, evidently it’s very slow in cold first runs, until all info files get into the filesystem cache.
  
The '''tdpkg''' shared library is used to speed up [http://en.wikipedia.org/wiki/Dpkg dpkg] *.list files loading using either [http://1978th.net/tokyocabinet/ tokyocabinet] or [http://www.sqlite.org sqlite3].
+
The '''tdpkg''' is an interesting solution to this problem — a wrapper library that’s used to speed up [http://en.wikipedia.org/wiki/Dpkg dpkg] *.list files loading using either [http://1978th.net/tokyocabinet/ tokyocabinet] or [http://www.sqlite.org sqlite3]. The larger your dpkg database is, the larger will be the speedup you notice.
  
 
It currently works for the [http://www.debian.org Debian GNU/Linux] system.
 
It currently works for the [http://www.debian.org Debian GNU/Linux] system.
Line 8: Line 8:
 
* Source code repository: http://gitorious.org/lethal-works/tdpkg
 
* Source code repository: http://gitorious.org/lethal-works/tdpkg
  
= build =
+
= Build =
  
 
'''Requirements''': build-essential, libsqlite3-dev ''(for the sqlite3 backend)'' and libtokyocabinet-dev ''(for the tokyocabinet backend)''
 
'''Requirements''': build-essential, libsqlite3-dev ''(for the sqlite3 backend)'' and libtokyocabinet-dev ''(for the tokyocabinet backend)''
Line 16: Line 16:
 
Type <code>make CACHE=sqlite</code> instead for sqlite3 support.
 
Type <code>make CACHE=sqlite</code> instead for sqlite3 support.
  
'''Important''': Library is experimental, it could make your system highly unstable.
+
= How it works =
  
= usage =
+
tdpkg is a simple wrapper library that intercepts file operations like open() and read() when put into LD_PRELOAD. Then it checks if the requested file is under <tt>/var/lib/dpkg/info</tt> and if its contents are already cached inside an indexed database (tokyocabinet or sqlite). If yes, then it returns data from the cache database instead of really reading it from the file.
  
Please make a backup copy of your /var/lib/dpkg/info/ directory before using tdpkg.
+
= Stability =
 +
 
 +
Library is experimental, there is no warranty of any sort on it!
 +
 
 +
In theory it could make your system, in particular, dpkg and apt-get, highly unstable.
 +
 
 +
My usage experience for more than a year, though, tells that tdpkg almost never fails, and when it fails, it doesn’t make any corruption to real dpkg database. All the negative impact is a temporary inability to install some packages while using tdpkg. Such problems are always fixed by running apt-get with tdpkg temporarily disabled.
 +
 
 +
= Usage =
 +
 
 +
Please make a backup copy of your <tt>/var/lib/dpkg/info/</tt> directory before starting to use tdpkg.
  
 
Manual usage:
 
Manual usage:
Line 26: Line 36:
 
<pre>LD_PRELOAD=./libtdpkg.so dpkg ... </pre>
 
<pre>LD_PRELOAD=./libtdpkg.so dpkg ... </pre>
  
For system-wide usage use an alias (make sure you use the absolute path to libtdpkg.so):
+
For system-wide usage put aliases into your <tt>~/.bashrc</tt> (make sure you use the absolute path to libtdpkg.so):
  
 
<pre>alias dpkg="LD_PRELOAD=/path/to/libtdpkg.so dpkg"
 
<pre>alias dpkg="LD_PRELOAD=/path/to/libtdpkg.so dpkg"
Line 33: Line 43:
 
The cache for both sqlite3 and tokyocabinet is located at '''/var/lib/dpkg/info/tdpkg.cache'''.
 
The cache for both sqlite3 and tokyocabinet is located at '''/var/lib/dpkg/info/tdpkg.cache'''.
  
= benchmarking =
+
= Benchmarking =
  
 
The operations involved with dpkg database reading are mostly done on the file system. For this reason cleaning up the kernel cache is a must before calling either tdpkg or dpkg:
 
The operations involved with dpkg database reading are mostly done on the file system. For this reason cleaning up the kernel cache is a must before calling either tdpkg or dpkg:
  
 
<pre>echo 1 > /proc/sys/vm/drop_caches</pre>
 
<pre>echo 1 > /proc/sys/vm/drop_caches</pre>

Revision as of 22:58, 4 October 2012

Surprisingly, the Debian package manager dpkg stores its database as a very large number of small files under the /var/lib/dpkg/info directory, without any indexes! As a result, it is very slow on a cold first run, until all the info files are in the filesystem cache.

tdpkg is an interesting solution to this problem: a wrapper library that speeds up the loading of dpkg's *.list files using either tokyocabinet or sqlite3. The larger your dpkg database is, the greater the speedup you will notice.

It currently works on Debian GNU/Linux.

Build

Requirements: build-essential, libsqlite3-dev (for the sqlite3 backend) and libtokyocabinet-dev (for the tokyocabinet backend)

Type make to build tdpkg with tokyocabinet support.

Type make CACHE=sqlite instead for sqlite3 support.

How it works

tdpkg is a simple wrapper library that, when listed in LD_PRELOAD, intercepts file operations such as open() and read(). It checks whether the requested file is under /var/lib/dpkg/info and whether its contents are already cached in an indexed database (tokyocabinet or sqlite). If so, it returns the data from the cache database instead of actually reading the file.

Stability

The library is experimental and comes with no warranty of any sort!

In theory it could make your system, in particular dpkg and apt-get, highly unstable.

My experience of using it for more than a year, though, shows that tdpkg almost never fails, and when it does fail, it does not corrupt the real dpkg database. The only negative impact is a temporary inability to install some packages while using tdpkg. Such problems are always fixed by running apt-get with tdpkg temporarily disabled.

Usage

Please make a backup copy of your /var/lib/dpkg/info/ directory before starting to use tdpkg.
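
One possible way to make that backup is sketched below. The function name backup_dpkg_info, the default backup location /var/backups and the timestamped directory name are assumptions made for illustration; they are not part of tdpkg.

```shell
#!/bin/sh
# Copy a dpkg info directory to a timestamped backup directory.
# Usage: backup_dpkg_info [SOURCE_DIR] [BACKUP_PARENT_DIR]
# Both defaults below are illustrative assumptions, not tdpkg conventions.
backup_dpkg_info() {
    src="${1:-/var/lib/dpkg/info}"
    parent="${2:-/var/backups}"
    dest="$parent/dpkg-info-$(date +%Y%m%d-%H%M%S)"
    # -a copies recursively and preserves permissions and timestamps.
    # On success, print the backup path so the caller can record it.
    cp -a "$src" "$dest" && printf '%s\n' "$dest"
}
```

Run as root, backup_dpkg_info with no arguments copies /var/lib/dpkg/info and prints where the copy was written.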

Manual usage:

LD_PRELOAD=./libtdpkg.so dpkg ... 

To use tdpkg for every dpkg and apt-get call in your shell, put aliases into your ~/.bashrc (make sure to use the absolute path to libtdpkg.so):

alias dpkg="LD_PRELOAD=/path/to/libtdpkg.so dpkg"
alias apt-get="LD_PRELOAD=/path/to/libtdpkg.so apt-get" 

The cache for both sqlite3 and tokyocabinet is located at /var/lib/dpkg/info/tdpkg.cache.
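
The Stability section mentions running apt-get with tdpkg temporarily disabled. A minimal sketch of one way to do that with the aliases above in place follows; the helper name without_tdpkg is hypothetical. Because the command is passed as an argument, the shell does not expand the alias, and env -u also removes LD_PRELOAD in case it was exported.

```shell
#!/bin/sh
# Run a command with LD_PRELOAD removed from its environment.
# The command name is an argument here, so shell aliases such as
# alias apt-get="LD_PRELOAD=... apt-get" are not applied to it.
without_tdpkg() {
    env -u LD_PRELOAD "$@"
}
```

For example, without_tdpkg apt-get install followed by a package name runs the plain apt-get for that one call.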

Benchmarking

Reading the dpkg database consists mostly of file system operations. For this reason, the kernel cache must be dropped before each run of either tdpkg or dpkg:

echo 1 > /proc/sys/vm/drop_caches
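
A cold-cache comparison could be scripted roughly as follows. This is only a sketch: the helper names drop_caches and elapsed_seconds are hypothetical, the choice of dpkg -S as the command to time is an assumption (it searches the *.list files), and timing is limited to whole seconds.

```shell
#!/bin/sh
# Drop the kernel page cache. Must be run as root.
# sync first so that dirty pages are written out before the drop.
drop_caches() {
    sync
    echo 1 > /proc/sys/vm/drop_caches
}

# Print how many whole seconds a command took; its output is discarded
# and its exit status is ignored.
elapsed_seconds() {
    start=$(date +%s)
    "$@" > /dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}

# Example comparison (as root; the library path is a placeholder):
#   drop_caches; elapsed_seconds dpkg -S /bin/ls
#   drop_caches; elapsed_seconds env LD_PRELOAD=/path/to/libtdpkg.so dpkg -S /bin/ls
```

Dropping the cache before each of the two runs keeps the comparison fair, since otherwise the second run would benefit from files read by the first.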