mysql - type - MyISAM vs. InnoDB





InnoDB provides:

ACID transactions
row-level locking
foreign key constraints
automatic crash recovery
table compression (read/write)
spatial data types (no spatial indexes)

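In InnoDB, all the data in a row, except for TEXT and BLOB columns, can occupy at most 8,000 bytes. InnoDB has no full-text indexing. In InnoDB, COUNT(*) (when WHERE, GROUP BY, or JOIN is not used) executes more slowly than in MyISAM, because the row count is not stored internally. InnoDB stores both data and indexes in one file. InnoDB uses a buffer pool to cache both data and indexes.

MyISAM provides: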

fast COUNT(*)s (when WHERE, GROUP BY, or JOIN is not used)
full text indexing
smaller disk footprint
very high table compression (read only)
spatial data types and indexes (R-tree)

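MyISAM has table-level locking, but no row-level locking. No transactions. No automatic crash recovery, but it does offer repair table functionality. No foreign key constraints. MyISAM tables are usually more compact on disk than InnoDB tables. If needed, MyISAM tables can be shrunk much further by compressing them with myisampack, but they then become read-only. MyISAM stores indexes in one file and data in another. MyISAM uses key buffers to cache indexes and leaves data caching to the operating system.

Overall, I would recommend InnoDB for most purposes and MyISAM for specialized uses only. InnoDB is now the default engine in new versions of MySQL.

I'm working on a project that involves a lot of database writes, I'd say (70% inserts and 30% reads). This ratio also includes updates, which I consider to be one read and one write. The reads can be dirty (e.g. I don't need 100% accurate information at the time of the read).
The task in question will be doing over 1 million database transactions an hour.

I've read a bunch of stuff on the web about the differences between MyISAM and InnoDB, and MyISAM seems like the obvious choice for the particular database/tables that I'll be using for this task. From what I seem to be reading, InnoDB is good if transactions are needed, since row-level locking is supported.

Does anybody have any experience with this type of load (or higher)? Is MyISAM the way to go?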


The Question and most of the Answers are out of date.

Yes, it is an old wives' tale that MyISAM is faster than InnoDB. Notice the Question's date: 2008; it is now almost a decade later. InnoDB has made significant performance strides since then.

The dramatic graph was for the one case where MyISAM wins: COUNT(*) without a WHERE clause. But is that really what you spend your time doing?

If you run a concurrency test, InnoDB is very likely to win, even against MEMORY.

If you do any writes while benchmarking SELECTs, MyISAM and MEMORY are likely to lose because of table-level locking.

In fact, Oracle is so sure that InnoDB is better that they removed MyISAM from 8.0!

The Question was written early in the days of 5.1. Since then, these major versions were marked "General Availability":

  • 2010: 5.5 (.8 in Dec.)
  • 2013: 5.6 (.10 in Feb.)
  • 2015: 5.7 (.9 in Oct.)
  • [TBD, maybe 2018], 8.0

Bottom line: Don't use MyISAM


MyISAM

The MyISAM engine is the default engine in most MySQL installations and is a derivative of the original ISAM engine type supported in the early versions of the MySQL system. The engine provides the best combination of performance and functionality, although it lacks transaction capabilities (use the InnoDB or BDB engines) and uses table-level locking.

Unless you need transactions, there are few databases and applications that cannot effectively be stored using the MyISAM engine. However, very high-performance applications where there are large numbers of data inserts/updates compared to the number of reads can cause performance problems for the MyISAM engine. It was originally designed with the idea that more than 90% of the database access to a MyISAM table would be reads, rather than writes.

With table-level locking, a database with a high number of row inserts or updates becomes a performance bottleneck as the table is locked while data is added. Luckily this limitation also works well within the restrictions of a non-transaction database.

MyISAM Summary

Name: MyISAM
Introduced: v3.23
Default install: Yes
Data limitations: None
Index limitations: 64 indexes per table (32 pre 4.1.2); max 16 columns per index
Transaction support: No
Locking level: Table

InnoDB

The InnoDB Engine is provided by Innobase Oy and supports all of the database functionality (and more) of MyISAM engine and also adds full transaction capabilities (with full ACID (Atomicity, Consistency, Isolation, and Durability) compliance) and row level locking of data.

The key to the InnoDB system is a database, caching and indexing structure where both indexes and data are cached in memory as well as being stored on disk. This enables very fast recovery, and works even on very large data sets. By supporting row level locking, you can add data to an InnoDB table without the engine locking the table with each insert and this speeds up both the recovery and storage of information in the database.

As with MyISAM, there are few data types that cannot effectively be stored in an InnoDB database. In fact, there are no significant reasons why you shouldn't always use an InnoDB database. The management overhead for InnoDB is slightly more onerous, and getting the optimization right for the sizes of in-memory and on-disk caches and database files can be complex at first. However, it also means that you get more flexibility over these values and, once set, the performance benefits can easily outweigh the initial time spent. Alternatively, you can let MySQL manage this automatically for you.

If you are willing (and able) to configure the InnoDB settings for your server, then I would recommend that you spend the time to optimize your server configuration and then use the InnoDB engine as the default.

InnoDB Summary

Name: InnoDB
Introduced: v3.23 (source only), v4.0 (source and binary)
Default install: No
Data limitations: None
Index limitations: None
Transaction support: Yes (ACID compliant)
Locking level: Row


Please note that my formal education and experience is with Oracle, while my work with MySQL has been entirely personal and on my own time, so if I say things that are true for Oracle but are not true for MySQL, I apologize. While the two systems share a lot, the relational theory/algebra is the same, and relational databases are still relational databases, there are still plenty of differences!!

I particularly like (as well as row-level locking) that InnoDB is transaction-based, meaning that you may be updating/inserting/creating/altering/dropping/etc several times for one "operation" of your web application. The problem that arises is that if only some of those changes/operations end up being committed, but others do not, you will most times (depending on the specific design of the database) end up with a database with conflicting data/structure.

Note: With Oracle, create/alter/drop statements are called "DDL" (Data Definition) statements, and implicitly trigger a commit. Insert/update/delete statements, called "DML" (Data Manipulation), are not committed automatically, but only when a DDL, commit, or exit/quit is performed (or if you set your session to "auto-commit", or if your client auto-commits). It's imperative to be aware of that when working with Oracle, but I am not sure how MySQL handles the two types of statements. Because of this, I want to make it clear that I'm not sure of this when it comes to MySQL; only with Oracle.

An example of when transaction-based engines excel:

Let's say that I or you are on a web-page to sign up to attend a free event, and one of the main purposes of the system is to only allow up to 100 people to sign up, since that is the limit of the seating for the event. Once 100 sign-ups are reached, the system would disable further signups, at least until others cancel.

In this case, there may be a table for guests (name, phone, email, etc.), and a second table which tracks the number of guests that have signed up. We thus have two operations for one "transaction". Now suppose that after the guest info is added to the GUESTS table, there is a connection loss, or an error with the same impact. The GUESTS table was updated (inserted into), but the connection was lost before the "available seats" could be updated.

Now we have a guest added to the guest table, but the number of available seats is now incorrect (for example, value is 85 when it's actually 84).

Of course there are many ways to handle this, such as tracking available seats with "100 minus number of rows in guests table," or some code that checks that the info is consistent, etc.... But with a transaction-based database engine such as InnoDB, either ALL of the operations are committed, or NONE of them are. This can be helpful in many cases, but like I said, it's not the ONLY way to be safe, no (a nice way, however, handled by the database, not the programmer/script-writer).
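
To make the all-or-nothing behaviour concrete, here is a minimal sketch of the sign-up example. It uses Python's built-in sqlite3 module purely as a stand-in for a transactional engine such as InnoDB (an assumption made so the example runs without a MySQL server). The table names `guests` and `seats` and the function `sign_up` are hypothetical.

```python
import sqlite3

def make_db():
    # isolation_level=None puts the sqlite3 module in autocommit mode, so
    # transactions are controlled explicitly with BEGIN / COMMIT / ROLLBACK.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE guests (name TEXT NOT NULL)")
    conn.execute("CREATE TABLE seats (available INTEGER NOT NULL)")
    conn.execute("INSERT INTO seats (available) VALUES (100)")
    return conn

def sign_up(conn, name, fail_midway=False):
    """Insert the guest and decrement the seat counter as one unit."""
    conn.execute("BEGIN")
    try:
        conn.execute("INSERT INTO guests (name) VALUES (?)", (name,))
        if fail_midway:
            # Simulates the connection loss between the two statements.
            raise ConnectionError("lost connection")
        conn.execute("UPDATE seats SET available = available - 1")
        conn.execute("COMMIT")
        return True
    except Exception:
        # Undo the guest insert too, so both tables stay consistent.
        conn.execute("ROLLBACK")
        return False

def state(conn):
    guests = conn.execute("SELECT COUNT(*) FROM guests").fetchone()[0]
    seats = conn.execute("SELECT available FROM seats").fetchone()[0]
    return guests, seats

conn = make_db()
sign_up(conn, "Alice")                   # both statements commit
sign_up(conn, "Bob", fail_midway=True)   # neither statement is kept
print(state(conn))
```

After the failed second sign-up, the guest count and the seat counter still agree with each other, which is exactly the guarantee a non-transactional engine cannot give you.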

That's all "transaction-based" essentially means in this context, unless I'm missing something -- that either the whole transaction succeeds as it should, or nothing is changed, since making only partial changes could make a minor to SEVERE mess of the database, perhaps even corrupting it...

But I'll say it one more time, it's not the only way to avoid making a mess. But it is one of the methods that the engine itself handles, leaving you to code/script with only needing to worry about "was the transaction successful or not, and what do I do if not (such as retry)," instead of manually writing code to check it "manually" from outside of the database, and doing a lot more work for such events.

Lastly, a note about table-locking vs row-locking:

DISCLAIMER: I may be wrong in all that follows in regard to MySQL, and the hypothetical/example situations are things to look into, but I may be wrong in what exactly is possible to cause corruption with MySQL. The examples are however very real in general programming, even if MySQL has more mechanisms to avoid such things...

Anyway, I am fairly confident in agreeing with those who have argued that how many connections are allowed at a time does not work around a locked table. In fact, multiple connections are the entire point of locking a table!! So that other processes/users/apps are not able to corrupt the database by making changes at the same time.

How would two or more connections working on the same row make a REALLY BAD DAY for you?? Suppose two processes both want/need to update the same value in the same row, let's say because the row is a record of a bus tour, and each of the two processes simultaneously wants to update the "riders" or "available_seats" field as "the current value plus 1."

Let's do this hypothetically, step by step:

  1. Process one reads the current value, let's say it's empty, thus '0' so far.
  2. Process two reads the current value as well, which is still 0.
  3. Process one writes (current + 1) which is 1.
  4. Process two should be writing 2, but since it read the current value before process one wrote the new value, it too writes 1 to the table.
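
The four steps above can be replayed in a few lines. This is only a toy model: a Python dict stands in for the table row, and the names `row` and `add_rider` are made up for the illustration. The second half shows the same two increments done as one uninterruptible read-modify-write, which is what holding a lock across the read and the write, or a single `UPDATE ... SET riders = riders + 1` statement, gives you.

```python
import threading

# Toy model of the lost update: `row` stands in for one table row.
row = {"riders": 0}

# Steps 1-2: both processes read before either one writes.
seen_by_one = row["riders"]
seen_by_two = row["riders"]

# Steps 3-4: each writes "the value I read, plus 1".
row["riders"] = seen_by_one + 1
row["riders"] = seen_by_two + 1
lost_update_result = row["riders"]   # one of the two increments is gone

# Same two increments, but the read and the write happen under one lock,
# so they cannot be interleaved with another process's read.
row["riders"] = 0
lock = threading.Lock()

def add_rider():
    with lock:
        row["riders"] = row["riders"] + 1

threads = [threading.Thread(target=add_rider) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
locked_result = row["riders"]

print(lost_update_result, locked_result)
```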

I'm not certain that two connections could intermingle like that, both reading before the first one writes... But if not, then I would still see a problem with:

  1. Process one reads the current value, which is 0.
  2. Process one writes (current + 1), which is 1.
  3. Process two reads the current value now. But while process one DID write (update), it has not committed the data, thus only that same process can read the new value that it updated, while all others see the older value, until there is a commit.

Also, at least with Oracle databases, there are isolation levels, which I will not waste our time trying to paraphrase. Here is a good article on that subject; each isolation level has its pros and cons, which goes along with how important transaction-based engines may be in a database...

Lastly, there may likely be different safeguards in place within MyISAM, instead of foreign-keys and transaction-based interaction. Well, for one, there is the fact that an entire table is locked, which makes it less likely that transactions/FKs are needed .

And alas, if you are aware of these concurrency issues, yes you can play it less safe and just write your applications, set up your systems so that such errors are not possible (your code is then responsible, rather than the database itself). However, in my opinion, I would say that it is always best to use as many safeguards as possible, programming defensively, and always being aware that human error is impossible to completely avoid. It happens to everyone, and anyone who says they are immune to it must be lying, or hasn't done more than write a "Hello World" application/script. ;-)

I hope that SOME of that is helpful to some one, and even more-so, I hope that I have not just now been a culprit of assumptions and being a human in error!! My apologies if so, but the examples are good to think about, research the risk of, and so on, even if they are not potential in this specific context.

Feel free to correct me, edit this "answer," even vote it down. Just please try to improve, rather than correcting a bad assumption of mine with another. ;-)

This is my first response, so please forgive the length due to all the disclaimers, etc... I just don't want to sound arrogant when I am not absolutely certain!


Also check out some drop-in replacements for MySQL itself:

MariaDB

http://mariadb.org/

MariaDB is a database server that offers drop-in replacement functionality for MySQL. MariaDB is built by some of the original authors of MySQL, with assistance from the broader community of Free and open source software developers. In addition to the core functionality of MySQL, MariaDB offers a rich set of feature enhancements including alternate storage engines, server optimizations, and patches.

Percona Server

https://launchpad.net/percona-server

An enhanced drop-in replacement for MySQL, with better performance, improved diagnostics, and added features.


Every application has its own performance profile for using a database, and chances are it will change over time.

The best thing you can do is to test your options. Switching between MyISAM and InnoDB is trivial, so load some test data and fire jmeter against your site and see what happens.
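
For what it's worth, the switch itself is a single `ALTER TABLE ... ENGINE = ...` statement per table. Below is a small sketch of a helper that builds that statement; the function name `switch_engine_sql` and the table name are hypothetical, and the helper only produces the SQL text, it does not talk to a server. Keep in mind that the statement rebuilds the table, so it can take a while on large tables.

```python
ALLOWED_ENGINES = {"InnoDB", "MyISAM"}

def switch_engine_sql(table, engine):
    """Build the ALTER TABLE statement that converts `table` to `engine`."""
    if engine not in ALLOWED_ENGINES:
        raise ValueError("unsupported engine: {!r}".format(engine))
    # Identifiers cannot be passed as query parameters, so accept only
    # plain names made of letters, digits and underscores.
    if not table or not table.replace("_", "").isalnum():
        raise ValueError("unexpected table name: {!r}".format(table))
    return "ALTER TABLE `{}` ENGINE = {}".format(table, engine)

for engine in ("MyISAM", "InnoDB"):
    print(switch_engine_sql("test_table", engine))
```

Run your load test once per engine, with the same test data, and compare.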


I know this won't be popular but here goes:

myISAM lacks support for database essentials like transactions and referential integrity which often results in glitchy / buggy applications. You cannot learn proper database design fundamentals if they are not even supported by your db engine.

Not using referential integrity or transactions in the database world is like not using object oriented programming in the software world.

InnoDB exists now, use that instead! Even MySQL developers have finally conceded to change this to the default engine in newer versions, despite myISAM being the original engine that was the default in all legacy systems.

No it does not matter if you are reading or writing or what performance considerations you have, using myISAM can result in a variety of problems, such as this one I just ran into: I was performing a database sync and at the same time someone else accessed an application that accessed a table set to myISAM. Due to the lack of transaction support and the generally poor reliability of this engine, this crashed the entire database and I had to manually restart mysql!

Over the past 15 years of development I have used many databases and engines. myISAM crashed on me about a dozen times during this period, other databases, only once! And that was a microsoft SQL database where some developer wrote faulty CLR code (common language runtime - basically C# code that executes inside the database) by the way, it was not the database engine's fault exactly.

I agree with the other answers here that say that quality high-availability, high-performance applications should not use myISAM as it will not work, it is not robust or stable enough to result in a frustration-free experience. See Bill Karwin's answer for more details.

PS Gotta love it when myISAM fanboys downvote but can't tell you which part of this answer is incorrect.



I've figured out that even though MyISAM has locking contention, it's still faster than InnoDB in most scenarios because of the rapid lock acquisition scheme it uses. I've tried InnoDB several times and always fall back to MyISAM for one reason or another. Also, InnoDB can be very CPU intensive under huge write loads.


If it is 70% inserts and 30% reads, then the workload leans more toward InnoDB.


In short, InnoDB is good if you are working on something that needs a reliable database that can handle a lot of INSERT and UPDATE instructions.

MyISAM, on the other hand, is good if you need a database that will mostly be taking a lot of read (SELECT) instructions rather than writes (INSERT and UPDATE), considering its drawback of table-level locking.

You may want to check out:
Pros and Cons of InnoDB
Pros and Cons of MyISAM


Bottom line: if you are working offline with selects on large chunks of data, MyISAM will probably give you better (much better) speeds.

There are some situations when MyISAM is far more efficient than InnoDB: when manipulating large data dumps offline (because of table lock).

Example: I was converting a CSV file (15M records) from NOAA which uses VARCHAR fields as keys. InnoDB was taking forever, even with large chunks of memory available.

This is an example of the CSV (first and third fields are keys).

USC00178998,20130101,TMAX,-22,,,7,0700
USC00178998,20130101,TMIN,-117,,,7,0700
USC00178998,20130101,TOBS,-28,,,7,0700
USC00178998,20130101,PRCP,0,T,,7,0700
USC00178998,20130101,SNOW,0,T,,7,

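Since what I need to do is run batch offline updates of observed weather phenomena, I use a MyISAM table for receiving data and run JOINs on the keys, so that I can clean the incoming file and replace the VARCHAR fields with INT keys (which are related to external tables where the original VARCHAR values are stored).


People often talk about performance, reads vs. writes, foreign keys, etc., but there's one other must-have feature for a storage engine in my opinion: atomic updates.

Try this:

  1. Issue an UPDATE against your MyISAM table that takes 5 seconds.
  2. While the UPDATE is in progress, say 2.5 seconds in, hit Ctrl-C to interrupt it.
  3. Observe the effects on the table. How many rows were updated? How many were not updated? Is the table even readable, or was it corrupted when you hit Ctrl-C?
  4. Try the same experiment with an UPDATE against an InnoDB table, interrupting the query in progress.
  5. Observe the InnoDB table. No rows were updated. InnoDB guarantees atomic updates: if the full update cannot be committed, it rolls back the whole change. Also, the table is not corrupt. This works even if you use killall -9 mysqld to simulate a crash.

Performance is desirable of course, but not losing data should trump that.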


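If you use MyISAM, you won't be doing any transactions per hour, unless you consider each DML statement to be a transaction (which in any case won't be durable or atomic in the event of a crash).

Therefore I think you have to use InnoDB.

300 transactions per second sounds like quite a lot. If you absolutely need these transactions to be durable across power failure, make sure your I/O subsystem can easily handle this many writes per second. You will need at least a RAID controller with battery-backed cache.

If you can take a small durability hit, you could use InnoDB with innodb_flush_log_at_trx_commit set to 0 or 2 (see the docs for details), which can improve performance.

There are a number of patches from Google and others that can increase concurrency - these may be of interest if you still can't get enough performance without them.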


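I'm not a database expert, and I do not speak from experience. However:

MyISAM tables use table-level locking. Based on your traffic estimates, you have close to 200 writes per second. With MyISAM, only one of these could be in progress at any time. You have to make sure that your hardware can keep up with these transactions to avoid being overrun, i.e., a single query can take no more than 5 ms.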
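
Those two figures follow from the numbers in the question. A quick back-of-the-envelope check, assuming the 70% insert share is the write share and that the table lock fully serializes the writes:

```python
transactions_per_hour = 1_000_000   # "over 1 million ... an hour"
write_share = 0.70                   # "70% inserts" from the question

transactions_per_second = transactions_per_hour / 3600
writes_per_second = transactions_per_second * write_share

# If only one write can run at a time, each one has to finish within
# this many milliseconds for the table to keep up.
budget_ms = 1000 / writes_per_second

print(round(transactions_per_second))  # about 278 transactions per second
print(round(writes_per_second))        # about 194 writes per second
print(round(budget_ms, 1))             # about 5.1 ms per write
```

Since the question counts each update as one read plus one write, the real write share is only approximate, but the order of magnitude holds.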

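That suggests to me you would need a storage engine which supports row-level locking, i.e., InnoDB.

On the other hand, it should be fairly trivial to write a few simple scripts to simulate the load with each storage engine, then compare the results.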


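I've worked on a high-volume system using MySQL and I've tried both MyISAM and InnoDB.

I found that the table-level locking in MyISAM caused serious performance problems for our workload, which sounds similar to yours. Unfortunately, I also found that performance under InnoDB was worse than I'd hoped.

In the end I resolved the contention issue by fragmenting the data such that inserts went into a "hot" table and selects never queried the hot table.

This also allowed deletes (the data was time-sensitive and we only retained X days' worth) to occur on "stale" tables that again weren't touched by select queries. InnoDB seems to have poor performance on bulk deletes, so if you're planning on purging data you might want to structure it in such a way that the old data is in a stale table which can simply be dropped instead of running deletes on it.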
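
One hypothetical way to lay this out is one table per day, so that purging becomes a DROP TABLE rather than a bulk DELETE. The sketch below only computes table names and the DROP statements; the `events_YYYYMMDD` naming scheme and the helper names are assumptions made for the illustration, not what our system actually used.

```python
from datetime import date, timedelta

def table_for(day, prefix="events"):
    """Name of the table that holds the rows for one calendar day."""
    return "{}_{:%Y%m%d}".format(prefix, day)

def expired_tables(existing, today, keep_days, prefix="events"):
    """Tables whose day is more than `keep_days` days before `today`."""
    cutoff = table_for(today - timedelta(days=keep_days), prefix)
    # The YYYYMMDD suffix makes the names sort chronologically.
    return sorted(
        name for name in existing
        if name.startswith(prefix + "_") and name < cutoff
    )

def drop_statements(tables):
    """One DROP TABLE per stale table, instead of a bulk DELETE."""
    return ["DROP TABLE `{}`".format(name) for name in tables]

today = date(2024, 1, 10)
existing = [table_for(today - timedelta(days=n)) for n in range(5)]
stale = expired_tables(existing, today, keep_days=3)
print(drop_statements(stale))
```

Inserts would go to `table_for(today)` (the hot table), while reporting queries read only the older tables.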

Of course I have no idea what your application is, but hopefully this gives you some insight into some of the issues with MyISAM and InnoDB.


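A bit late to the game... but here's a quite comprehensive post I wrote a few months back, detailing the major differences between MyISAM and InnoDB. Grab a cuppa (and maybe a biscuit), and enjoy.

The major difference between MyISAM and InnoDB is in referential integrity and transactions. There are also other differences, such as locking, rollbacks, and full-text searches.

Referential Integrity

Referential integrity ensures that relationships between tables remain consistent. More specifically, this means that when a table (e.g. Listings) has a foreign key (e.g. Product ID) pointing to a different table (e.g. Products), and updates or deletes occur on the pointed-to table, these changes are cascaded to the linking table. In our example, if a product is renamed, the linking table's foreign keys will also update; if a product is deleted from the 'Products' table, any listings which point to the deleted entry will also be deleted. Furthermore, any new listing must have that foreign key pointing to a valid, existing entry.

InnoDB is a relational DBMS (RDBMS) and thus has referential integrity, while MyISAM does not.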
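
Here is a small runnable illustration of the Products/Listings example. It uses Python's built-in sqlite3 module as a stand-in for an engine that enforces foreign keys (an assumption made only so it runs without a MySQL server); in MySQL the same REFERENCES ... ON DELETE CASCADE clause would be declared on an InnoDB table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE listings ("
    " id INTEGER PRIMARY KEY,"
    " product_id INTEGER NOT NULL"
    "   REFERENCES products(id) ON UPDATE CASCADE ON DELETE CASCADE)"
)
conn.execute("INSERT INTO products VALUES (1, 'widget')")
conn.execute("INSERT INTO listings VALUES (10, 1)")

# A new listing must point at a valid, existing product.
try:
    conn.execute("INSERT INTO listings VALUES (11, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Deleting the product cascades to the listings that point at it.
conn.execute("DELETE FROM products WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM listings").fetchone()[0]

print(rejected, remaining)
```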

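Transactions & Atomicity

Data in a table is managed using Data Manipulation Language (DML) statements, such as SELECT, INSERT, UPDATE and DELETE. A transaction groups two or more DML statements together into a single unit of work, so either the entire unit is applied, or none of it is.

MyISAM does not support transactions, whereas InnoDB does.

If an operation is interrupted while using a MyISAM table, the operation is aborted immediately, and the rows (or even data within each row) that were affected remain affected, even if the operation did not go to completion.

If an operation is interrupted while using an InnoDB table, because it uses transactions, which have atomicity, any transaction which did not go to completion will not take effect, since no commit is made.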

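Table-locking vs Row-locking

When a query runs against a MyISAM table, the entire table it is querying is locked. This means subsequent queries will only be executed after the current one has finished. If you are reading a large table, and/or there are frequent read and write operations, this can mean a huge backlog of queries.

When a query runs against an InnoDB table, only the row(s) involved are locked; the rest of the table remains available for CRUD operations. This means queries can run simultaneously on the same table, provided they do not use the same row.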
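
A toy model shows why this matters for a backlog. It treats every operation as a write, assumes operations on different rows can run fully in parallel, and ignores all lock overhead; the function names and the sample workload are made up for the illustration.

```python
from collections import defaultdict

def makespan_table_lock(ops):
    """Every operation waits for the previous one: durations add up."""
    return sum(duration for _row, duration in ops)

def makespan_row_lock(ops):
    """Only operations that touch the same row wait for each other."""
    per_row = defaultdict(int)
    for row, duration in ops:
        per_row[row] += duration
    return max(per_row.values(), default=0)

# (row id, duration in ms): four writes, two of them on row 1.
ops = [(1, 5), (2, 5), (3, 5), (1, 5)]

print(makespan_table_lock(ops), makespan_row_lock(ops))
```

With a table lock the four writes take the sum of their durations; with row locks only the two writes to row 1 queue behind each other.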

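This feature in InnoDB is known as concurrency. As great as concurrency is, there is a major drawback that applies to a select range of tables, in that there is an overhead in switching between kernel threads, and you should set a limit on the kernel threads to prevent the server from coming to a halt.

Transactions & Rollbacks

When you run an operation in MyISAM, the changes are set; in InnoDB, those changes can be rolled back. The most common commands used to control transactions are COMMIT, ROLLBACK and SAVEPOINT.

  1. COMMIT - you can write multiple DML operations, but the changes will only be saved when a COMMIT is made
  2. ROLLBACK - you can discard any operations that have not yet been committed
  3. SAVEPOINT - sets a point in the list of operations to which a ROLLBACK operation can roll back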
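
The three commands can be tried out in a few lines. As a stand-in for InnoDB this sketch uses Python's built-in sqlite3 module (an assumption, chosen so it runs without a server); the table `t` and the savepoint name `after_first` are arbitrary.

```python
import sqlite3

# isolation_level=None: transactions are started and ended explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (v INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT after_first")
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO SAVEPOINT after_first")  # undoes only the second insert
conn.execute("COMMIT")                             # the first insert is now saved

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (3)")
conn.execute("ROLLBACK")                           # discards the uncommitted insert

values = [row[0] for row in conn.execute("SELECT v FROM t ORDER BY v")]
print(values)
```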

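Reliability

MyISAM offers no data integrity - hardware failures, unclean shutdowns and canceled operations can cause the data to become corrupt. This would require a full repair or rebuild of the indexes and tables.

InnoDB, on the other hand, uses a transactional log, a double-write buffer and automatic checksumming and validation to prevent corruption. Before InnoDB makes any changes, it records the data before the transactions into a system tablespace file called ibdata1. If there is a crash, InnoDB recovers automatically by replaying those logs.

FULLTEXT Indexing

InnoDB did not support FULLTEXT indexing until MySQL version 5.6.4. As of the writing of this post, many shared hosting providers' MySQL version is still below 5.6.4, which means that FULLTEXT indexing is not supported for InnoDB tables.

However, this is not a valid reason to use MyISAM. It's best to change to a hosting provider that supports up-to-date versions of MySQL. Not that a MyISAM table that uses FULLTEXT indexing cannot be converted to an InnoDB table.

Conclusion

In conclusion, InnoDB should be your default storage engine of choice. Choose MyISAM or other data types when they serve a specific need.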


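To add to the wide selection of responses here covering the mechanical differences between the two engines, I present an empirical speed comparison study.

In terms of pure speed, it is not always the case that MyISAM is faster than InnoDB, but in my experience it tends to be faster for PURE READ working environments by a factor of about 2.0-2.5 times. Clearly this isn't appropriate for all environments - as others have written, MyISAM lacks such things as transactions and foreign keys.

I've done a bit of benchmarking below - I've used python for looping and the timeit library for timing comparisons. For interest I've also included the memory engine; this gives the best performance across the board, although it is only suitable for smaller tables (you continually encounter The table 'tbl' is full when you exceed the MySQL memory limit). The four types of select I look at are:

  1. Vanilla SELECTs
  2. Counts
  3. Conditional SELECTs
  4. Indexed and non-indexed sub-selects

Firstly, I created three tables using the following SQL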

CREATE TABLE
    data_interrogation.test_table_myisam
    (
        index_col BIGINT NOT NULL AUTO_INCREMENT,
        value1 DOUBLE,
        value2 DOUBLE,
        value3 DOUBLE,
        value4 DOUBLE,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8

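with 'MyISAM' substituted for 'InnoDB' and 'memory' in the second and third tables.

1) Vanilla selects

Query: SELECT * FROM tbl WHERE index_col = xx

Result: draw

The speed of these is all broadly the same, and as expected is linear in the number of columns to be selected. InnoDB seems slightly faster than MyISAM, but this is really marginal.

Code: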

import timeit
import MySQLdb
import MySQLdb.cursors
import random
from random import randint

db = MySQLdb.connect(host="...", user="...", passwd="...", db="...", cursorclass=MySQLdb.cursors.DictCursor)
cur = db.cursor()

lengthOfTable = 100000

# Fill up the tables with random data
for x in range(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

    cur.execute(insertString)
    cur.execute(insertString2)
    cur.execute(insertString3)

db.commit()

# Define a function to pull a certain number of records from these tables
def selectRandomRecords(testTable,numberOfRecords):

    for x in range(numberOfRecords):
        # AUTO_INCREMENT keys start at 1, so draw from 1..lengthOfTable
        rand1 = randint(1,lengthOfTable)

        selectString = "SELECT * FROM " + testTable + " WHERE index_col = " + str(rand1)
        cur.execute(selectString)

setupString = "from __main__ import selectRandomRecords"

# Test time taken using timeit
myisam_times = []
innodb_times = []
memory_times = []

for theLength in [3,10,30,100,300,1000,3000,10000]:

    innodb_times.append( timeit.timeit('selectRandomRecords("test_table_innodb",' + str(theLength) + ')', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('selectRandomRecords("test_table_myisam",' + str(theLength) + ')', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('selectRandomRecords("test_table_memory",' + str(theLength) + ')', number=100, setup=setupString) )

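2) Counts

Query: SELECT count(*) FROM tbl

Result: MyISAM wins

This one demonstrates a big difference between MyISAM and InnoDB - MyISAM (and memory) keeps track of the number of records in the table, so this transaction is fast and O(1). The amount of time required for InnoDB to count increases super-linearly with table size in the range I investigated. I suspect many of the speed-ups from MyISAM queries that are observed in practice are due to similar effects.

Code: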

myisam_times = []
innodb_times = []
memory_times = []

# Define a function to count the records
def countRecords(testTable):

    selectString = "SELECT count(*) FROM " + testTable
    cur.execute(selectString)

setupString = "from __main__ import countRecords"

# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE test_table_innodb"
    truncateString2 = "TRUNCATE test_table_myisam"
    truncateString3 = "TRUNCATE test_table_memory"

    cur.execute(truncateString)
    cur.execute(truncateString2)
    cur.execute(truncateString3)

    for x in range(theLength):
        rand1 = random.random()
        rand2 = random.random()
        rand3 = random.random()
        rand4 = random.random()

        insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)
        cur.execute(insertString3)

    db.commit()

    # Count and time the query
    innodb_times.append( timeit.timeit('countRecords("test_table_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('countRecords("test_table_myisam")', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('countRecords("test_table_memory")', number=100, setup=setupString) )

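3) Conditional selects

Query: SELECT * FROM tbl WHERE value1<0.5 AND value2<0.5 AND value3<0.5 AND value4<0.5

Result: MyISAM wins

Here, MyISAM and memory perform approximately the same, and beat InnoDB by about 50% for larger tables. This is the sort of query for which the benefits of MyISAM seem to be maximised.

Code: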

myisam_times = []
innodb_times = []
memory_times = []

# Define a function to perform conditional selects
def conditionalSelect(testTable):
    selectString = "SELECT * FROM " + testTable + " WHERE value1 < 0.5 AND value2 < 0.5 AND value3 < 0.5 AND value4 < 0.5"
    cur.execute(selectString)

setupString = "from __main__ import conditionalSelect"

# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE test_table_innodb"
    truncateString2 = "TRUNCATE test_table_myisam"
    truncateString3 = "TRUNCATE test_table_memory"

    cur.execute(truncateString)
    cur.execute(truncateString2)
    cur.execute(truncateString3)

    for x in range(theLength):
        rand1 = random.random()
        rand2 = random.random()
        rand3 = random.random()
        rand4 = random.random()

        insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)
        cur.execute(insertString3)

    db.commit()

    # Time the conditional select
    innodb_times.append( timeit.timeit('conditionalSelect("test_table_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('conditionalSelect("test_table_myisam")', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('conditionalSelect("test_table_memory")', number=100, setup=setupString) )

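4) Sub-selects

Result: InnoDB wins

For this query, I created an additional set of tables for the sub-select. Each is simply two columns of BIGINTs, one with a primary key index and one without any index. Due to the large table size, I didn't test the memory engine. The SQL table creation command was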

CREATE TABLE
    subselect_myisam
    (
        index_col bigint NOT NULL,
        non_index_col bigint,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8;

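where once again, 'MyISAM' is substituted for 'InnoDB' in the second table.

In this query, I leave the size of the selection table at 1000000 and instead vary the size of the sub-selected columns.

Here InnoDB wins easily. After we get to a reasonably sized table, both engines scale linearly with the size of the sub-select. The index speeds up the MyISAM command but, interestingly, has little effect on the InnoDB speed.

Code: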

myisam_times = []
innodb_times = []
myisam_times_2 = []
innodb_times_2 = []

def subSelectRecordsIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)

setupString = "from __main__ import subSelectRecordsIndexed"

def subSelectRecordsNotIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT non_index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)

setupString2 = "from __main__ import subSelectRecordsNotIndexed"

# Truncate the old tables, and re-fill with 1000000 records
truncateString = "TRUNCATE test_table_innodb"
truncateString2 = "TRUNCATE test_table_myisam"

cur.execute(truncateString)
cur.execute(truncateString2)

lengthOfTable = 1000000

# Fill up the tables with random data
for x in range(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

    cur.execute(insertString)
    cur.execute(insertString2)

for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE subselect_innodb"
    truncateString2 = "TRUNCATE subselect_myisam"

    cur.execute(truncateString)
    cur.execute(truncateString2)

    # For each length, empty the table and re-fill it with random data
    rand_sample = sorted(random.sample(range(lengthOfTable), theLength))
    rand_sample_2 = random.sample(range(lengthOfTable), theLength)

    for (the_value_1,the_value_2) in zip(rand_sample,rand_sample_2):
        insertString = "INSERT INTO subselect_innodb (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"
        insertString2 = "INSERT INTO subselect_myisam (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)

    db.commit()

    # Finally, time the queries
    innodb_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString) )

    innodb_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString2) )
    myisam_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString2) )

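I think the take-home message of all of this is that if you are really concerned about speed, you need to benchmark the queries that you're doing rather than make any assumptions about which engine will be more suitable.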
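
If you want a starting point for your own measurements, here is a minimal sketch of a reusable timing helper built on timeit. It runs against an in-memory sqlite3 database only so that it is self-contained (an assumption; it says nothing about MyISAM or InnoDB speed). The helper name `time_query` and the sample queries are hypothetical; to use it against MySQL you would pass a MySQLdb cursor-based runner instead, and MySQLdb uses %s rather than ? as the parameter placeholder.

```python
import sqlite3
import timeit

def time_query(conn, sql, params=(), repeats=100):
    """Return the total seconds taken to run `sql` `repeats` times."""
    def run():
        # fetchall() forces the full result set to be read each time.
        conn.execute(sql, params).fetchall()
    return timeit.timeit(run, number=repeats)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (index_col INTEGER PRIMARY KEY, value1 REAL)")
# Values are passed as parameters rather than concatenated into the SQL.
conn.executemany(
    "INSERT INTO tbl (value1) VALUES (?)",
    [(i / 1000.0,) for i in range(1000)],
)

queries = {
    "point lookup": ("SELECT * FROM tbl WHERE index_col = ?", (500,)),
    "count": ("SELECT count(*) FROM tbl", ()),
    "conditional": ("SELECT * FROM tbl WHERE value1 < ?", (0.5,)),
}
results = {
    name: time_query(conn, sql, params)
    for name, (sql, params) in queries.items()
}
for name, seconds in results.items():
    print("{}: {:.4f}s for 100 runs".format(name, seconds))
```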






