Last update: 2009-12-23



Update on the development progress

There has been quite a bit of interesting development progress lately that I want to share. Besides that, I again corrected some stupid IMDb data. First, the enhancements and new features:

As I said above, I again stumbled over some stupid IMDb entries. The most annoying one was in the keywords, where one entry was more than 100 characters in length. I have already fixed it online in the database, but it will be January before the list file is updated. There was also one very long entry in the color-info file, but that one did make sense. I'm going to change the import and the table structure. Currently the color info table contains an id and a colorinfo column, but the file also holds additional information, which is currently stored in the colorinfo column as well. The next version will store this additional information in a new column named additional, as is already done for some other data files.
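
The planned split of the color info could look roughly like the following Java sketch. It assumes that the additional information follows the color value in parentheses (for example "Color", then a TAB or space, then "(Technicolor)"). The class and method names are made up for illustration and are not the actual JMDB import code.

```java
/** Hypothetical helper: splits a raw color-info value into the two planned columns. */
final class ColorInfoSplitter {

    /**
     * Returns {colorinfo, additional}. The second element is null when the
     * value carries no additional information.
     * Assumption: the additional part starts at the first opening parenthesis.
     */
    static String[] split(String raw) {
        String value = raw.trim();
        int open = value.indexOf('(');
        if (open <= 0) {
            // No parenthesis (or the value starts with one): keep everything in colorinfo.
            return new String[] { value, null };
        }
        String colorinfo = value.substring(0, open).trim();
        String additional = value.substring(open).trim();
        return new String[] { colorinfo, additional };
    }

    public static void main(String[] args) {
        String[] parts = split("Color\t(Technicolor)");
        System.out.println("colorinfo=" + parts[0] + ", additional=" + parts[1]);
    }
}
```

Keeping the additional part in its own column means the colorinfo column only holds a handful of distinct values, which should make it easier to query.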

Ok that's it for the moment. There will be some update shortly that's not only for the beta-testers. ;)


General update

There hasn't been much progress lately, but the moving and renovating is now complete. I also had some trouble with a water leak coming from the apartment above mine. The result didn't look very nice, I can tell you. Anyway, I'm working on JMDB again and also on some webOS (Palm Pre/Pixi) applications right now. I hope I'm still able to release a new JMDB version this year, which would be nice for the 10-year anniversary of the Java Movie Database (Note: The public GUI application release was in 2001, but development started in late 1999 with a command-line application).


Update on the JMDB development progress

I've again been busy at work and also with some renovating, which slowed me down on the JMDB development. Still, I was able to make a few enhancements:

Besides that, I had lots of requests about JMDB lately. Some asked how to remove the TV episodes from the result when running a self-built SQL query, and some asked again for the IMDb file parser code. That parser code is something I can't share because it's the fundamental part of JMDB and also strongly tied to the database/import code (the movie lookup class,...).
There have also been two requests for the IMDb movie ID file that is updated by a JMDB user and uploaded to me from time to time (the latest file is from April 2009; I don't know how far IMDb's article swapping had progressed in April!).

IMDb again had incorrect and incomplete data in their list files. The number of errors JMDB writes into the IMDb error file is not as big as before, but there are still some missing links to movies. There have also been some release-dates entries with a '?' instead of a date. I ask you: Don't you think it should be required to give an exact date for the 'release-dates'? If I don't know the date, it's worth nothing to tell every user that something has been released in a country but I have no clue when. I really think IMDb should stop adding stupid information like this.


Next JMDB releases (development progress)

Sorry I didn't update the website within the last 3 months. I've been busy with my new job and right now I'm also moving. Still, JMDB has been updated in the meantime.

Most of the time went into updating the import code, where a user pointed me in the right direction (JDBC Batch Inserts). At first I updated only some methods so that Batch Inserts were used, and the speed improvement was good for PostgreSQL. MySQL at first wasn't any faster, but that changed after I updated to a still unreleased Connector/J JDBC driver and added the URL parameter "rewriteBatchedStatements=true". I learned that from a post by the Connector/J developer Mark Matthews on his blog (Mark Matthews: A 10x Performance Increase for Batch INSERTs With MySQL Connector/J Is On The Way....). With this parameter and the new daily build of the MySQL JDBC driver, MySQL could again compete with PostgreSQL, which had been taking the lead performance-wise. PostgreSQL was still a little bit faster, but only because of the faster index creation, especially on the movies2actors table (round about 11 million entries for each of the two indexes).

Round about two weeks ago the external InnoDB plugin v1.0.4 was released for MySQL v5.1.37 (InnoDB Plugin Download) and I tested it as well. The blog entry announcing the InnoDB Plugin v1.0.4 lists a few enhancements over the built-in InnoDB engine of MySQL. With the external InnoDB plugin the index creation is now a lot faster, so I can recommend this plugin if you want to speed up MySQL (I'll attach a PDF file later with some performance charts for PostgreSQL and MySQL). The plugin also offers file compression, but I haven't tested that yet.
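
For readers who haven't used JDBC Batch Inserts, here is a minimal Java sketch of the general technique. It is not the JMDB import code: the table name movies, the column title and the helper names are assumptions made up for this example. Only the standard JDBC calls (addBatch/executeBatch) and the Connector/J URL parameter rewriteBatchedStatements=true are taken as given.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchInsertSketch {

    /** Appends rewriteBatchedStatements=true to a MySQL JDBC URL unless it is already set. */
    static String withRewriteBatchedStatements(String jdbcUrl) {
        if (jdbcUrl.contains("rewriteBatchedStatements=")) {
            return jdbcUrl;
        }
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "rewriteBatchedStatements=true";
    }

    /**
     * Inserts all titles in batches of batchSize rows inside one transaction.
     * Table and column names are hypothetical.
     */
    static void insertTitles(Connection con, List<String> titles, int batchSize) throws SQLException {
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false);
        try (PreparedStatement ps = con.prepareStatement("INSERT INTO movies (title) VALUES (?)")) {
            int pending = 0;
            for (String title : titles) {
                ps.setString(1, title);
                ps.addBatch();              // queue the row instead of sending it right away
                if (++pending == batchSize) {
                    ps.executeBatch();      // send the queued rows to the server in one go
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();          // flush the remaining rows
            }
            con.commit();
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
    }
}
```

A connection for this would be opened with something like DriverManager.getConnection(BatchInsertSketch.withRewriteBatchedStatements(url), user, password). A good batch size depends on the database and the row size, so it is best found by measuring.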

You might want to know how much faster the current development version of JMDB really is. One of the beta testers (he's using MySQL) was able to import all IMDb list files in 29 min and 45 sec, while it took 2h 55min and 53 sec with the previous version (comparable with JMDB v1.36). That's round about 5.9 times faster. I'll clean up the current development version over the next few weeks and release it to the public. That way none of you has to wait any longer for the next release; I think the speed improvement achieved is worth it.

The IMDb data is still pretty messed up (after the import of the IMDb list files just look into the IMDb_Error.log that JMDB creates). I'm going to hunt down someone at IMDb to get those problems solved. Maybe I should try to get in contact with Col Needham directly.


They finally did it: one of the problematic IMDb data errors is fixed

The IMDb error in the AKA-Names file is finally fixed (IMDb data from 2009-05-15), so you don't have to fix the file manually and JMDB v1.36 doesn't crash any more while importing the IMDb files. As the german-aka file hasn't been uploaded to the FTP server for the last month or so (and hasn't been updated for years), you need to disable it in the JMDB setup dialog because JMDB imports the file by default. That's all for now. Once I have more to share I'll let you know.


Here again some updates on "Ups, they did it again..." (see 2009-04-06) and other stuff

As the IMDb errors I wrote about here haven't been fixed, I used the normal IMDb Helpdesk last weekend to report the error in the AKA-Names file. This time I got a reply, which I'm going to share:

Re: AKA Name "Vanessa" includes a TAB char

Thanks for reporting the problem to us.

We are aware of this technical issue and our staff is looking into it.
We hope to have it fixed as soon as possible.
Sorry for any inconvenience caused and thanks once again for bringing this to our attention!

The IMDb Help Desk

I still have my doubts but we'll see if and when this changes.

Other updates on JMDB

I did a quick hack to integrate Growl for Windows. Below you see the result using the standard notification box and then again using the plain box (there are more styles). As I didn't send an icon together with the notification message, it's using the default icon.

Growl for Windows notification sample 1
Screenshot of the "standard" Growl for Windows notification style (JMDB development version 2009-05-09)

Growl for Windows notification sample 2
Screenshot of the "plain" Growl for Windows notification style (JMDB development version 2009-05-09)


Update on "Ups, they did it again..." (see 2009-04-06)

Sad to say, but the reported errors still haven't been fixed. There are fewer errors in the release-dates.list, though, as only one entry at the top has no title but a release date. The other errors are still there. Let's see if a miracle happens and the problems are fixed next week. Judging by the IMDb processing times page, my updates should normally have been processed by now.

I already see it coming that I need to release my internal development version of JMDB v1.40 to the public so the broken IMDb data can be used without correcting the files manually. I haven't touched the current development version for the last couple of days as I have to work on something else right now. End of next week I should be able to resume the work again.


JMDB v1.40 progress and Update on "Ups, they did it again..." (see 2009-04-06)

There has again been much progress on JMDB development within the last two weeks. This includes support for two more list files. The JMDB v1.40 development version has been sent to the test users.

So there are now four more list files supported in JMDB v1.40 compared to the current v1.36. I'm pretty sure that I'll also add the soundtracks.list.

Ups, they did it again... Almost two weeks after I fixed the error online using the correction form, the AKA-Names list still contains this error. It's even better! One of the beta testers pointed out that the AKA-Names file from last week contains another error where a TAB character is involved. The good thing is that this doesn't break the import code as the other one does.

For JMDB v1.36 you still need to correct the error manually as it has been written below (News entry from 2009-04-06) or you can disable the import of the AKA-Names file. JMDB v1.40 contains a workaround for the errors but I can't release it yet to the public and I also can't apply the fix to the current release version. I really hope next week my fix of the data finally makes it into the list file export.

Other errors in the release-dates.list that I reported round about four weeks ago also haven't been fixed yet. As I wrote earlier, these are worked around in JMDB v1.40. Generally all kinds of IMDb data errors are written to the IMDb_Error.log that JMDB creates while importing new data.
Currently this file is 6.5 MB big (with IMDb list files from 2009-04-10) if you select all files to import. Round about 80% of the errors are from the locations.list, which will be supported in JMDB v1.40 and up. Most of the other errors (a little bit over 15%) come from the german-aka-titles/italian-aka-titles files, because those files have been unsupported by IMDb for a few years now. If IMDb changes the title of a movie a little bit (movies.list), the outdated files are not updated. The result is that the titles in those aka-titles files are not found in the movies.list, and each entry not found shows up in the IMDb_Error.log.


Ups, they did it again...IMDb released a broken AKA-Names list file which is crashing the JMDB import process

IMDb released a broken AKA-Names list file last weekend that is crashing the JMDB import. I already fixed the IMDb data using a web form, but it will take the IMDb list keepers at least until next Saturday to fix the problem (maybe longer). A TAB has been added in the data where it doesn't belong, and this is not the first time this has happened. IMDb really needs more checks when data is stored in the database.
If you want to use the aka-names.list.gz released on 2009-04-03 you need to extract the file and fix the broken entry using a text editor.

Search for the entry "Vidal, Vanessa" (see full entry below):
Vidal, Vanessa
(aka       Vanessa)
(aka Kennedy, Mrs. V.)
(aka Videl, Vanessa)

The first aka entry contains a TAB between aka and Vanessa that has to be removed. After you have fixed this issue, save the file and start the JMDB file import. It should no longer crash while importing the AKA-Names.
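
If you prefer not to edit the extracted file by hand, the same fix can be sketched in a few lines of Java. This is only an illustration and not part of JMDB: the class name is made up, and the ISO-8859-1 encoding of the list file is an assumption. It replaces any whitespace directly after "(aka" (including a TAB) with a single space and leaves all other lines untouched.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical one-off repair tool for the extracted aka-names.list. */
final class AkaNameFixer {

    /** Replaces the whitespace run directly after "(aka" with one space. */
    static String fixAkaLine(String line) {
        return line.replaceFirst("\\(aka\\s+", "(aka ");
    }

    /** Copies in to out line by line, fixing aka entries on the way. */
    static void fixFile(Path in, Path out) throws IOException {
        // Assumption: the list file is ISO-8859-1 encoded.
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.ISO_8859_1);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(fixAkaLine(line));
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java AkaNameFixer aka-names.list aka-names.fixed.list
        fixFile(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Write the result to a new file and keep the original until the import has run through.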

The release-dates.list.gz also contains several errors (see news from 2009-03-29) which I worked around for the next major JMDB release v1.40 (still in early internal testing stage). The errors have been reported to IMDb round about two weeks ago but so far they haven't been fixed.

While I'm talking about JMDB v1.40, I can tell you that I enhanced the locations list a little bit. I created hyperlinks that open the web browser with the location details in Google Maps. Most of the time you'll get a map showing the location used to film the movie. Google Maps has some problems if the location contains the name of a building and then doesn't find the location, but in general it's working quite nicely.
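
Building such a link mostly comes down to URL-encoding the location text. The following Java sketch shows the idea. It is not the JMDB code: the class name is made up, and the URL pattern (the Google Maps search URL with the api=1 and query parameters) is an assumption picked for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that turns a filming location into a Google Maps link. */
final class LocationLinkSketch {

    // Assumption: Google Maps search URL; JMDB may use a different pattern.
    private static final String MAPS_SEARCH = "https://www.google.com/maps/search/?api=1&query=";

    static String mapsUrl(String location) {
        try {
            // URLEncoder turns spaces into '+' and escapes characters such as ','.
            return MAPS_SEARCH + URLEncoder.encode(location, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mapsUrl("Paris, France"));
    }
}
```

On Java 6 and later the resulting link can be opened in the default browser with java.awt.Desktop.getDesktop().browse(java.net.URI.create(url)), provided Desktop.isDesktopSupported() returns true.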


First Alpha of JMDB v1.40 sent to external testers!

I've been working hard on the next JMDB version. It doesn't have a release date as I'm going to get a new job shortly and I can't plan anything right now.

So far the feedback on the new version was very good as no problems have been found. There are still some things in the works but here are some functions already working:

Some other things which should improve the import speed are also in the works. The query speed should be increased by using multiple connections (connection pooling).


The webserver hosting the JMDB packages will be moved shortly!

The provider hosting my personal website (including other projects) is moving the complete server farm to the new computing center. In the night from 18th February to 19th February 2009 (CET) my server will be affected by this task.

As the JMDB packages are stored on this personal website there will be problems downloading the latest JMDB releases until they have completed that task. Please excuse this inconvenience. Many thanks in advance!


JMDB v1.36 refresh is available at the download section!

The refreshed version of the JMDB v1.36 is now available. The only things changed between the first release and this one are two more language files for Chinese mainland and Taiwan.
You don't have to update if you don't need the Chinese language files.


JMDB v1.36 is available at the download section!

The new JMDB v1.36 is now available. You should upgrade as soon as possible. As before there are installer packages for eComStation (OS/2), Windows and a simple ZIP archive for most other operating systems available. JMDB can be used on every platform where a Java runtime (Java >=1.2) is available. Some smaller enhancements are also available for Mac OS X users.


JMDB v1.36 will be available shortly!

There has been a problem with the aka-titles.list(.gz) for some time now that the upcoming JMDB release v1.36 will fix. The startup scripts will also be updated. I've already created the JAR file of the new version but I have to update the documentation and create the installation packages for each of the operating systems.
I'm also still working on JMDB v1.40, which will follow next. That release will include the database changes I already talked about.

As you might have noticed, I also updated the IMDb data development picture on the entry page. The startup scripts for JMDB v1.36 are also being updated because the import has again hit the limit of 300 MB of memory. If you haven't updated the -Xmx parameter yet, you should have seen the OutOfMemoryError warning. I got it when I tried to import the IMDb data from 2009-01-30 with Java 1.6.0_11-b03 on Vista (32-bit) while JMDB was processing the biographies.list(.gz) file.
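
To check which heap limit your Java runtime actually uses, a small test class like the following can help. It is only a sketch and not part of JMDB; the class name and the 512 MB threshold are made-up examples. Note that the value reported by Runtime.maxMemory() can be somewhat lower than the value given with -Xmx.

```java
/** Hypothetical check: is the configured maximum heap large enough for an import? */
final class HeapCheck {

    private static final long BYTES_PER_MB = 1024L * 1024L;

    /** Returns true if maxHeapBytes is at least requiredMb megabytes. */
    static boolean heapAtLeast(long maxHeapBytes, long requiredMb) {
        return maxHeapBytes >= requiredMb * BYTES_PER_MB;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        long requiredMb = 512; // example threshold, not an official JMDB requirement
        System.out.println("Max heap: " + (maxHeap / BYTES_PER_MB) + " MB");
        if (!heapAtLeast(maxHeap, requiredMb)) {
            System.out.println("Consider raising the limit, e.g. java -Xmx" + requiredMb + "m ...");
        }
    }
}
```

Run it with the same options as in your startup script, for example java -Xmx300m HeapCheck, to see what the JVM reports for that setting.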

Finally, the news section has been extended to reflect that it's now 2009. I think this will be a great year for JMDB. There might even be a JMDB-Mobile version with a subset of the IMDb data. More on that once I have released JMDB v1.40.


We need help!

We're looking for volunteers creating new language files.

If you want to help, contact us!


If you want to support the development you can donate!