Archive for the 'Tech' Category

January 18th 2011

SpamFiles

I’ve been whining for a while about SpamFiles’ speed on Windows. It creates hundreds of files, writes a small amount of data to each, then deletes them all. It’s orders of magnitude slower on Windows (all the way up to Windows 7) than on Linux, due to NTFS.
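For anyone who hasn’t seen it, the workload is roughly the following. This is a minimal sketch and not SpamFiles itself: the class name, file names and payload are all made up for illustration, and the real tool may well do things differently.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the SpamFiles workload, not the original code.
final class SpamFilesSketch {
    /** Creates `count` small files in `dir`, deletes them all, and returns elapsed nanoseconds. */
    static long run(Path dir, int count) throws IOException {
        byte[] payload = "a small amount of data\n".getBytes(StandardCharsets.UTF_8);
        List<Path> files = new ArrayList<>();
        long start = System.nanoTime();
        try {
            for (int i = 0; i < count; i++) {
                Path file = dir.resolve("spam-" + i + ".tmp");
                Files.write(file, payload); // create, write and close in one call
                files.add(file);
            }
            for (Path file : files) {
                Files.delete(file);
            }
            return System.nanoTime() - start;
        } finally {
            // Only does anything if the run failed part-way through.
            for (Path file : files) {
                Files.deleteIfExists(file);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int count = args.length > 0 ? Integer.parseInt(args[0]) : 500;
        Path dir = Files.createTempDirectory("spamfiles");
        System.out.printf("%d files: %.3f seconds%n", count, run(dir, count) / 1e9);
        Files.delete(dir);
    }
}
```

Run it with the same count on an NTFS volume and on a Linux filesystem to compare.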

It’s just a synthetic benchmark though, right? That is, it’s reasonably irrelevant. Or so I thought.

In a recent private project I was using Spring’s JdbcTemplate with SQLite to write out a couple of hundred rows to an empty table. JdbcTemplate defaults to autocommit and it’s non-trivial to convince it not to.

The relevant code and sqlitelulz.jar show why this is a problem:


>java -jar sqlitelulz.jar 1000
Autocommit: 70.867652636 seconds
Manual commit: 0.107324493 seconds

$ java -jar sqlitelulz.jar 1000
Autocommit: 1.814235004 seconds
Manual commit: 0.075502495 seconds

Yes, that’s 660 times slower on Windows (and only about 24 times slower on non-NTFS). That time is entirely SQLite creating and deleting its journal file.

Sadness.
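The two modes being compared look roughly like this in plain JDBC. It’s a sketch rather than the code in sqlitelulz.jar: the table name, the database file name and the class name are invented, and main assumes an SQLite JDBC driver (e.g. Xerial’s sqlite-jdbc) is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical comparison harness; names are illustrative only.
final class CommitModes {
    /** Inserts `rows` rows with autocommit on, so every insert is its own transaction. */
    static void insertAutocommit(Connection conn, int rows) throws SQLException {
        conn.setAutoCommit(true);
        insertRows(conn, rows);
    }

    /** Inserts `rows` rows inside one transaction, so there is a single commit at the end. */
    static void insertSingleTransaction(Connection conn, int rows) throws SQLException {
        conn.setAutoCommit(false);
        try {
            insertRows(conn, rows);
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(true);
        }
    }

    private static void insertRows(Connection conn, int rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement("INSERT INTO lulz (n) VALUES (?)")) {
            for (int i = 0; i < rows; i++) {
                ps.setInt(1, i);
                ps.executeUpdate();
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        int rows = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        // Assumption: an SQLite JDBC driver is available; the file name is arbitrary.
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:lulz.db")) {
            try (Statement st = conn.createStatement()) {
                st.executeUpdate("CREATE TABLE IF NOT EXISTS lulz (n INTEGER)");
            }
            long t0 = System.nanoTime();
            insertAutocommit(conn, rows);
            long t1 = System.nanoTime();
            insertSingleTransaction(conn, rows);
            long t2 = System.nanoTime();
            System.out.printf("Autocommit: %.3f seconds%n", (t1 - t0) / 1e9);
            System.out.printf("Manual commit: %.3f seconds%n", (t2 - t1) / 1e9);
        }
    }
}
```

With autocommit on, each insert is committed separately, which is where the per-row journal file churn comes from; the single-transaction version pays that cost once.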

December 14th 2010

git --set-commit-id

People often complain that git’s commit ids are too hard to remember and that they prefer the sequential ones generated by inferior version control systems.

Stock git doesn’t have an option to pick the commit id for a commit; this seems like a grave omission. I’ve prepared a patch which offers git commit --set-commit-id.

For example, everyone knows that the base commit in a repository should have a low number:

$ git init
Initialized empty Git repository in ./.git/
$ git add -A
$ git commit --set-commit-id 0000000 -a -m "Base."
Searching: 46% (12593/26843), done.
[master (root-commit) 0000000] Base.
1 files changed, 1 insertions(+), 0 deletions(-)
create mode 100644 myfile

If you’ve already messed up your repository, a handy fixing script is provided:

$ git lg
* fe5e2ee - (HEAD, master) work, work, work, it's all I do
* a2c1ec8 - work, work, work
* e580e5e - work, work
* a6ad5ee - work
* 0000000 - base
$ sequentialise.sh 0000000 6
Stopped at a6ad5ee... work
Searching: 39% (10468/26843), done.
[detached HEAD 0000010] work
1 files changed, 1 insertions(+), 0 deletions(-)
Stopped at e580e5e... work, work
Searching: 174% (46706/26843)
[...]
$ git lg
* 0000040 - (HEAD, master) work, work, work, it's all I do
* 0000030 - work, work, work
* 0000020 - work, work
* 0000010 - work
* 0000000 - base

Much more usable! This example repository is available for inspection. gitweb doesn’t show the commit ids on the log screen, but you can mouse-over and see them in the URLs.

Needless to say, this takes “a while”. sequentialise.sh defaults to 5 digits, i.e. enough for a million commits, and is reasonably fast on modern hardware. 6 digits is rather less tolerable.
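Why it takes “a while”: a commit id is the SHA-1 of the commit object, so the only way to get a chosen prefix is to keep changing the commit until the hash happens to fit. The sketch below shows the idea and is not the patch itself. It assumes the thing being varied is an extra nonce line at the end of the commit message (the real patch may vary something else), and the commit body and class name are made up.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical brute-force search for a commit id prefix; not the actual git patch.
final class CommitIdSearch {
    /** Git's id for a commit object: SHA-1 over "commit <byte length>\0" followed by the body. */
    static String commitId(String body) throws NoSuchAlgorithmException {
        byte[] content = body.getBytes(StandardCharsets.UTF_8);
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        sha1.update(("commit " + content.length + "\0").getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : sha1.digest(content)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Appends an increasing nonce line (an assumption) until the id starts with `prefix`. */
    static String search(String body, String prefix) throws NoSuchAlgorithmException {
        for (long nonce = 0; ; nonce++) {
            String candidate = body + "\nnonce " + nonce + "\n";
            if (commitId(candidate).startsWith(prefix)) {
                return candidate;
            }
        }
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Illustrative commit body: the tree is git's empty tree, the identity is invented.
        String body = "tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
                + "author A U Thor <author@example.com> 1292284800 +0000\n"
                + "committer A U Thor <author@example.com> 1292284800 +0000\n"
                + "\n"
                + "Base.\n";
        String found = search(body, "000");
        System.out.println(commitId(found));
    }
}
```

Each extra hex digit multiplies the expected number of attempts by 16: 16^5 is about a million, 16^6 about 16.8 million. The 26843 in the progress output above looks consistent with 16^7 counted in units of 10,000, and since it is an expected count rather than a limit, going past 100% is perfectly possible.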

August 4th 2010

Java’s ZipFile performance

I have an application that scales well up to around five threads a core, due to the mix of IO and CPU that it does.

That is, you give it more threads, and the throughput increases; the overall time goes down.

The following graph shows, in blue, the time taken by Sun’s java.util.zip.ZipFile to complete a set of unzips on an increasing number of threads:

Wait, what the cocking shit.
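For reference, a measurement of this shape can be reproduced with something like the following. It is a sketch, not the application in question: the class name and job count are invented, and it assumes every job opens its own ZipFile and reads every entry to the end, which may not match the original workload.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical timing harness; it measures, it does not explain.
final class UnzipThreads {
    /** Reads every entry of `zip` to the end and returns the number of bytes inflated. */
    static long drain(File zip) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (ZipFile zf = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zf.entries();
            while (entries.hasMoreElements()) {
                try (InputStream in = zf.getInputStream(entries.nextElement())) {
                    for (int n; (n = in.read(buffer)) != -1; ) {
                        total += n;
                    }
                }
            }
        }
        return total;
    }

    /** Runs `jobs` unzips on a pool of `threads` threads and returns elapsed nanoseconds. */
    static long time(File zip, int threads, int jobs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<Long> job = () -> drain(zip);
            long start = System.nanoTime();
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                results.add(pool.submit(job));
            }
            for (Future<Long> result : results) {
                result.get(); // wait for the job and surface any failure
            }
            return System.nanoTime() - start;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        File zip = new File(args[0]);
        for (int threads = 1; threads <= 16; threads *= 2) {
            System.out.printf("%2d threads: %.3f seconds%n", threads, time(zip, threads, 64) / 1e9);
        }
    }
}
```

If throughput scaled with threads, the printed times would fall as the thread count rises.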

April 23rd 2010

Java/C++ polyglot

Today I discovered Java’s “inline C++” keyword, //\u000a/*, which makes a Java/C++ polyglot pretty easy:


//\u000a/*
#include <iostream>

#define private
#define public
#define static
#define void int
struct {
  std::ostream &println(const char *c) {
    return std::cout << c << std::endl;
  }
} out;

//*/
/*\u002a/
import static java.lang.System.out;

public class Polyglot {
//*/
  public static void main(/*\u002a/String[] args//*/
      ) {
    out.println("Hello from whatever language this is!");
  }

/*\u002a/
}
// */

Eclipse deals.. okay. The red-underlining in the commented sections is for the spelling. <3

April 18th 2010

InstallShield unpacker

I couldn’t find anything that would unpack the (entirely unnecessary) Nokia map loader set-up application, which is some InstallShield 7 nastiness.

deshield can. Given the number of magic numbers in it, I fully expect it not to work with other installers.

Why do they bother? The data isn’t even compressed; it’s just bit-twiddled a little with the file-name, and this magic number: [ 0x13, 0x35, 0x86, 0x07 ].

April 8th 2010

Tro^WMicrobenchmarks!

This blog is far too low in trolling. As a start, everyone knows that git is fast and svn is slow, but I wasn’t aware quite how shocking the difference was.

The test: committing a file that slowly increases in size, and a new file, 200 times.

git: 2 seconds.
darcs: 10 seconds.
bzr: 70 seconds.
svn: 200 seconds.

No comment.

Reproduction steps follow.