Home > Databases, Programming > SHA1 Speed Differences [PHP vs MySQL] – Part 1

SHA1 Speed Differences [PHP vs MySQL] – Part 1

Reading my favorite blog, I’ve encountered a rather interesting article discussing the speed difference between PHP and MySQL in SHA1 algorithms. The results obtained were suspicious to me. Usually in database terms, reads are faster then inserts, but in this case it was the opposite.

Being the hard headed I am, I had to find out what was wrong with the tests conducted in that article. The poster kindly provided the source which was really dandy as I could execute the same tests as the poster did. After several head bangs trying to setup PHP to work flawless on my pc, I managed to analyse the tests.

At first glance, everything looked really well done and flawless. Tests were godly written and everything was up to standard. After considerable time contemplating and a beer, I came to the conclusion that the problem lies elsewhere. The tests were simple enough that it was very easy to understand and even test. One query had a fixed string and the other one had an SHA1 function call from MySQL, exactly as the test intended.

The thought that was haunting me was that even the normal PHP method, (ie. the string comparison) reading was slower than the insert. Looking at this problem alone and excluding PHP vs MySQL performance led me to the problems. Yes that’s right, there is more than one problem. It’s not how the tests are written but it’s more how databases work internally which should be taken into consideration once working with databases.

Database Search Explained

Before explaining the problem, let’s discuss a bit on how databases normally work and how they fetch data. Let’s take the tests provided by the poster and use his table as our example reference.

CREATE TABLE test_table (
    identity INTEGER AUTO_INCREMENT,
    digest VARCHAR(41),
    PRIMARY KEY (identity)
);

This is the table structure and as you can see, it is simple and perfectly normal. It has two fields, a numeric primary key and a string field of length 41 to store our hashed data.

When the database is asked to retrieve information which have ‘digest’ equals to ‘hello’, the database has to check each row if the text in ‘digest’ field matches ‘hello’. If there are 500 records, the database will check 500 records one by one and at the end returns the records which have digest = ‘hello’. As explained before, this is how normally work and not how they should work. This method of search can become very slow on large tables as checking each and every row isn’t a very intelligent way to search. Telephone directories have indices and each page is sorted by the surname to ease your searching. Imagine going through your telephone directory searching for the surname of ‘Zohan John” which should be located in the last few pages. That would take ages and this also applies for databases. This also applies for databases which make use of indices and sorting to retrieve that information quicker.

With an index on the ‘digest’ field one can locate information much faster instead of checking each field one by one. What happens when a database searches for information on an indexed field (the whole field is indexed and not the first letter or two)  is as follows. It will ask the index if there’s anything that matches the whole word ‘hello’. The index will reply with either a ‘match’ or ‘no match’. If a match is returned, the location will be identified and the database can go and retrieve the information immediately without waisting any more valuable time asking each and every record if it matches. The result is usually instant which is what raised my interest in the article.

Index Search vs Normal Search

After all the technicalities, it is time to see if what is said above is the culprit. The test is inserting 500 records into the table and querying the database 500 times one for each record which returns with a match. Without an index, the database has to go through all records and thus the ‘select’ results are slower then the ‘insert’ commands. Second beer in hand, I placed the index and the results produced were much better. The new table structure looks like this.

CREATE TABLE test_table (
    identity INTEGER AUTO_INCREMENT,
    digest VARCHAR(41),
    PRIMARY KEY (identity),
    KEY (digest)
);

This solved the first problem which was haunting me like my worst nightmare. Is ‘select’ slower than ‘insert’?

Results – Select vs Insert

My average results vary from the original article but the difference can still be identified.

Records Normal Select Index Select Insert
10 1.9342ms 1.7923ms 2.1394ms
500 2.5648ms 1.7885ms 2.2872ms
1000 3.1148ms 1.7948ms 2.0497ms
5000 3.6964ms 1.7961ms 2.2519ms

This shows that the indexed search really does a great job when dealing with a lot of information. They’re a necessity for a database to work soundly. I am not implying one should go overboard with indices but one should know when and how to use them.

With this said, there’s still the performance issue between PHP vs MySQL. Could it be MySQL performs a function which basically does the same that much slower, 4 times slower? Be patient and I’ll post my finding in another post as this is already too long to read.

Advertisement
  1. September 24, 2009 at 10:08 pm | #1

    Great article, and a good reminder of why one should use indices :)

    Thanks for covering this – it hadn’t occurred to me that the select *should* be faster. Looking forward to reading your further findings.

    Thanks for the comments about the blog too – much appreciated! :$

  2. September 24, 2009 at 10:30 pm | #2

    Thanks Karl, tried to contact you today but gchat was down. :] Love your blog, always fresh with new ideas. :] I’ll try to finish the article by tomorrow.

  3. September 25, 2009 at 1:16 am | #3

    Interesting post. I can’t wait for part 2!

  1. September 26, 2009 at 12:37 am | #1

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.