mysql锁 - MyISAM与InnoDB




mysql myisam (18)

我正在研究一个涉及大量数据库写入的项目,我会说( 70%的插入和30%的读取 )。 这个比例还包括我认为是一次读取和一次写入的更新。 读取可能很脏(例如,在阅读时我不需要100%准确的信息)。
有关任务将每小时完成超过100万次数据库事务。

我已经阅读了网上关于MyISAM和InnoDB之间差异的一些内容,MyISAM似乎是我将用于此任务的特定数据库/表的明显选择。 从我似乎正在阅读的内容来看,如果支持行级锁定,需要事务处理,InnoDB就很好。

有人对此类负载(或更高)有任何经验吗? MyISAM是否要走?


InnoDB提供:

ACID transactions
row-level locking
foreign key constraints
automatic crash recovery
table compression (read/write)
spatial data types (no spatial indexes)

在InnoDB中,除TEXT和BLOB以外的所有行中的数据最多可以占用8,000个字节。 InnoDB没有全文索引。 在InnoDB中,COUNT(*)s(当不使用WHERE,GROUP BY或JOIN时)的执行速度比MyISAM慢,因为行数不是内部存储的。 InnoDB将数据和索引存储在一个文件中。 InnoDB使用缓冲池来缓存数据和索引。

MyISAM提供:

fast COUNT(*)s (when WHERE, GROUP BY, or JOIN is not used)
full text indexing
smaller disk footprint
very high table compression (read only)
spatial data types and indexes (R-tree)

MyISAM具有表级锁定,但没有行级锁定。 没有交易。 没有自动崩溃恢复,但它确实提供修复表功能。 没有外键约束。 与InnoDB表相比,MyISAM表通常在磁盘上更紧凑。 如果需要,可以通过使用myisampack进行压缩,从而使MyISAM表进一步大幅缩小,但只能变为只读。 MyISAM将索引存储在一个文件中,将数据存储在另一个文件中 MyISAM使用密钥缓冲区来缓存索引,并将数据缓存管理留给操作系统。

总的来说,我会推荐InnoDB用于大多数用途,而MyISAM仅用于专业用途。 InnoDB现在是新版MySQL中的默认引擎。


The Question and most of the Answers are out of date .

Yes, it is an old wives' tale that MyISAM is faster than InnoDB. notice the Question's date: 2008; it is now almost a decade later. InnoDB has made significant performance strides since then.

The dramatic graph was for the one case where MyISAM wins: COUNT(*) without a WHERE clause. But is that really what you spend your time doing?

If you run concurrency test, InnoDB is very likely to win, even against MEMORY .

If you do any writes while benchmarking SELECTs , MyISAM and MEMORY are likely to lose because of table-level locking.

In fact, Oracle is so sure that InnoDB is better that they removed MyISAM from 8.0!

The Question was written early in the days of 5.1. Since then, these major versions were marked "General Availability":

  • 2010: 5.5 (.8 in Dec.)
  • 2013: 5.6 (.10 in Feb.)
  • 2015: 5.7 (.9 in Oct.)
  • [TBD, maybe 2018], 8.0

Bottom line: Don't use MyISAM


MyISAM

The MyISAM engine is the default engine in most MySQL installations and is a derivative of the original ISAM engine type supported in the early versions of the MySQL system. The engine provides the best combination of performance and functionality, although it lacks transaction capabilities (use the InnoDB or BDB engines) and uses table-level locking .

FlashMAX and FlashMAX Connect: Leading the Flash Platform Transformation Download Now Unless you need transactions, there are few databases and applications that cannot effectively be stored using the MyISAM engine. However, very high-performance applications where there are large numbers of data inserts/updates compared to the number of reads can cause performance proboelsm for the MyISAM engine. It was originally designed with the idea that more than 90% of the database access to a MyISAM table would be reads, rather than writes.

With table-level locking, a database with a high number of row inserts or updates becomes a performance bottleneck as the table is locked while data is added. Luckily this limitation also works well within the restrictions of a non-transaction database.

MyISAM Summary

Name -MyISAM

Introduced -v3.23

Default install -Yes

Data limitations -None

Index limitations -64 indexes per table (32 pre 4.1.2); Max 16 columns per index

Transaction support -No

Locking level -Table

InnoDB

The InnoDB Engine is provided by Innobase Oy and supports all of the database functionality (and more) of MyISAM engine and also adds full transaction capabilities (with full ACID (Atomicity, Consistency, Isolation, and Durability) compliance) and row level locking of data.

The key to the InnoDB system is a database, caching and indexing structure where both indexes and data are cached in memory as well as being stored on disk. This enables very fast recovery, and works even on very large data sets. By supporting row level locking, you can add data to an InnoDB table without the engine locking the table with each insert and this speeds up both the recovery and storage of information in the database.

As with MyISAM , there are few data types that cannot effectively be stored in an InnoDB database. In fact, there are no significant reasons why you shouldn't always use an InnoDB database. The management overhead for InnoDB is slightly more onerous, and getting the optimization right for the sizes of in-memory and on disk caches and database files can be complex at first. However, it also means that you get more flexibility over these values and once set, the performance benefits can easily outweigh the initial time spent. Alternatively, you can let MySQL manage this automatically for you.

If you are willing (and able) to configure the InnoDB settings for your server, then I would recommend that you spend the time to optimize your server configuration and then use the InnoDB engine as the default.

InnoDB Summary

Name -InnoDB

Introduced -v3.23 (source only), v4.0 (source and binary)

Default install -No

Data limitations -None

Index limitations -None

Transaction support -Yes (ACID compliant)

Locking level -Row


Please note that my formal education and experience is with Oracle, while my work with MySQL has been entirely personal and on my own time, so if I say things that are true for Oracle but are not true for MySQL, I apologize. While the two systems share a lot, the relational theory/algebra is the same, and relational databases are still relational databases, there are still plenty of differences!!

I particularly like (as well as row-level locking) that InnoDB is transaction-based, meaning that you may be updating/inserting/creating/altering/dropping/etc several times for one "operation" of your web application. The problem that arises is that if only some of those changes/operations end up being committed, but others do not, you will most times (depending on the specific design of the database) end up with a database with conflicting data/structure.

Note: With Oracle, create/alter/drop statements are called "DDL" (Data Definition) statements, and implicitly trigger a commit. Insert/update/delete statements, called "DML" (Data Manipulation), are not committed automatically, but only when a DDL, commit, or exit/quit is performed (or if you set your session to "auto-commit", or if your client auto-commits). It's imperative to be aware of that when working with Oracle, but I am not sure how MySQL handles the two types of statements. Because of this, I want to make it clear that I'm not sure of this when it comes to MySQL; only with Oracle.

An example of when transaction-based engines excel:

Let's say that I or you are on a web-page to sign up to attend a free event, and one of the main purposes of the system is to only allow up to 100 people to sign up, since that is the limit of the seating for the event. Once 100 sign-ups are reached, the system would disable further signups, at least until others cancel.

In this case, there may be a table for guests (name, phone, email, etc.), and a second table which tracks the number of guests that have signed up. We thus have two operations for one "transaction". Now suppose that after the guest info is added to the GUESTS table, there is a connection loss, or an error with the same impact. The GUESTS table was updated (inserted into), but the connection was lost before the "available seats" could be updated.

Now we have a guest added to the guest table, but the number of available seats is now incorrect (for example, value is 85 when it's actually 84).

Of course there are many ways to handle this, such as tracking available seats with "100 minus number of rows in guests table," or some code that checks that the info is consistent, etc.... But with a transaction-based database engine such as InnoDB, either ALL of the operations are committed, or NONE of them are. This can be helpful in many cases, but like I said, it's not the ONLY way to be safe, no (a nice way, however, handled by the database, not the programmer/script-writer).

That's all "transaction-based" essentially means in this context, unless I'm missing something -- that either the whole transaction succeeds as it should, or nothing is changed, since making only partial changes could make a minor to SEVERE mess of the database, perhaps even corrupting it...

But I'll say it one more time, it's not the only way to avoid making a mess. But it is one of the methods that the engine itself handles, leaving you to code/script with only needing to worry about "was the transaction successful or not, and what do I do if not (such as retry)," instead of manually writing code to check it "manually" from outside of the database, and doing a lot more work for such events.

Lastly, a note about table-locking vs row-locking:

DISCLAIMER: I may be wrong in all that follows in regard to MySQL, and the hypothetical/example situations are things to look into, but I may be wrong in what exactly is possible to cause corruption with MySQL. The examples are however very real in general programming, even if MySQL has more mechanisms to avoid such things...

Anyway, I am fairly confident in agreeing with those who have argued that how many connections are allowed at a time does not work around a locked table. In fact, multiple connections are the entire point of locking a table!! So that other processes/users/apps are not able to corrupt the database by making changes at the same time.

How would two or more connections working on the same row make a REALLY BAD DAY for you?? Suppose there are two processes both want/need to update the same value in the same row, let's say because the row is a record of a bus tour, and each of the two processes simultaneously want to update the "riders" or "available_seats" field as "the current value plus 1."

Let's do this hypothetically, step by step:

  1. Process one reads the current value, let's say it's empty, thus '0' so far.
  2. Process two reads the current value as well, which is still 0.
  3. Process one writes (current + 1) which is 1.
  4. Process two should be writing 2, but since it read the current value before process one write the new value, it too writes 1 to the table.

I'm not certain that two connections could intermingle like that, both reading before the first one writes... But if not, then I would still see a problem with:

  1. Process one reads the current value, which is 0.
  2. Process one writes (current + 1), which is 1.
  3. Process two reads the current value now. But while process one DID write (update), it has not committed the data, thus only that same process can read the new value that it updated, while all others see the older value, until there is a commit.

Also, at least with Oracle databases, there are isolation levels, which I will not waste our time trying to paraphrase. Here is a good article on that subject, and each isolation level having it's pros and cons, which would go along with how important transaction-based engines may be in a database...

Lastly, there may likely be different safeguards in place within MyISAM, instead of foreign-keys and transaction-based interaction. Well, for one, there is the fact that an entire table is locked, which makes it less likely that transactions/FKs are needed .

And alas, if you are aware of these concurrency issues, yes you can play it less safe and just write your applications, set up your systems so that such errors are not possible (your code is then responsible, rather than the database itself). However, in my opinion, I would say that it is always best to use as many safeguards as possible, programming defensively, and always being aware that human error is impossible to completely avoid. It happens to everyone, and anyone who says they are immune to it must be lying, or hasn't done more than write a "Hello World" application/script. ;-)

I hope that SOME of that is helpful to some one, and even more-so, I hope that I have not just now been a culprit of assumptions and being a human in error!! My apologies if so, but the examples are good to think about, research the risk of, and so on, even if they are not potential in this specific context.

Feel free to correct me, edit this "answer," even vote it down. Just please try to improve, rather than correcting a bad assumption of mine with another. ;-)

This is my first response, so please forgive the length due to all the disclaimers, etc... I just don't want to sound arrogant when I am not absolutely certain!


Also check out some drop-in replacements for MySQL itself:

MariaDB

http://mariadb.org/

MariaDB is a database server that offers drop-in replacement functionality for MySQL. MariaDB is built by some of the original authors of MySQL, with assistance from the broader community of Free and open source software developers. In addition to the core functionality of MySQL, MariaDB offers a rich set of feature enhancements including alternate storage engines, server optimizations, and patches.

Percona Server

https://launchpad.net/percona-server

An enhanced drop-in replacement for MySQL, with better performance, improved diagnostics, and added features.


Every application has it's own performance profile for using a database, and chances are it will change over time.

The best thing you can do is to test your options. Switching between MyISAM and InnoDB is trivial, so load some test data and fire jmeter against your site and see what happens.


I know this won't be popular but here goes:

myISAM lacks support for database essentials like transactions and referential integrity which often results in glitchy / buggy applications. You cannot not learn proper database design fundamentals if they are not even supported by your db engine.

Not using referential integrity or transactions in the database world is like not using object oriented programming in the software world.

InnoDB exists now, use that instead! Even MySQL developers have finally conceded to change this to the default engine in newer versions, despite myISAM being the original engine that was the default in all legacy systems.

No it does not matter if you are reading or writing or what performance considerations you have, using myISAM can result in a variety of problems, such as this one I just ran into: I was performing a database sync and at the same time someone else accessed an application that accessed a table set to myISAM. Due to the lack of transaction support and the generally poor reliability of this engine, this crashed the entire database and I had to manually restart mysql!

Over the past 15 years of development I have used many databases and engines. myISAM crashed on me about a dozen times during this period, other databases, only once! And that was a microsoft SQL database where some developer wrote faulty CLR code (common language runtime - basically C# code that executes inside the database) by the way, it was not the database engine's fault exactly.

I agree with the other answers here that say that quality high-availability, high-performance applications should not use myISAM as it will not work, it is not robust or stable enough to result in a frustration-free experience. See Bill Karwin's answer for more details.

PS Gotta love it when myISAM fanboys downvote but can't tell you which part of this answer is incorrect.



I've figure out that even though Myisam has locking contention, it's still faster than InnoDb in most scenarios because of the rapid lock acquisition scheme it uses. I've tried several times Innodb and always fall back to MyIsam for one reason or the other. Also InnoDB can be very CPU intensive in huge write loads.


If it is 70% inserts and 30% reads then it is more like on the InnoDB side.


In short, InnoDB is good if you are working on something that needs a reliable database that can handles a lot of INSERT and UPDATE instructions.

and, MyISAM is good if you needs a database that will mostly be taking a lot of read (SELECT) instructions rather than write (INSERT and UPDATES), considering its drawback on the table-lock thing.

you may want to check out;
Pros and Cons of InnoDB
Pros and Cons of MyISAM


bottomline: if you are working offline with selects on large chunks of data, MyISAM will probably give you better (much better) speeds.

there are some situations when MyISAM is infinitely more efficient than InnoDB: when manipulating large data dumps offline (because of table lock).

example: I was converting a csv file (15M records) from NOAA which uses VARCHAR fields as keys. InnoDB was taking forever, even with large chunks of memory available.

this an example of the csv (first and third fields are keys).

USC00178998,20130101,TMAX,-22,,,7,0700
USC00178998,20130101,TMIN,-117,,,7,0700
USC00178998,20130101,TOBS,-28,,,7,0700
USC00178998,20130101,PRCP,0,T,,7,0700
USC00178998,20130101,SNOW,0,T,,7,

因为我需要做的是运行观察到的天气现象的批量脱机更新,我使用MyISAM表接收数据并在键上运行JOINS,以便我可以清理传入文件并用INT键替换VARCHAR字段(这些字段与存储原始VARCHAR值的外部表)。


为了在这里涵盖两种发动机之间机械差异的广泛选择,我提出了一个经验的速度比较研究。

就纯速度而言,并不总是MyISAM比InnoDB更快,但根据我的经验,PURE READ工作环境的速度往往要快大约2.0-2.5倍。 显然,这不适用于所有环境 - 正如其他人所写的,MyISAM缺乏交易和外键等事情。

我在下面做了一些基准测试 - 我使用python进行循环,并使用timeit库进行时序比较。 为了提高兴趣,我还包含了内存引擎,尽管它只适用于较小的表格(您不断遇到超过MySQL内存限制The table 'tbl' is full ,但它在整个主板上提供了最佳性能。 我选择的四种类型是:

  1. 香草SELECTs
  2. 计数
  3. 条件SELECTs
  4. 索引和非索引子选择

首先,我使用以下SQL创建了三个表

CREATE TABLE
    data_interrogation.test_table_myisam
    (
        index_col BIGINT NOT NULL AUTO_INCREMENT,
        value1 DOUBLE,
        value2 DOUBLE,
        value3 DOUBLE,
        value4 DOUBLE,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8

'MyISAM'在第二个和第三个表中替换了'InnoDB'和'内存'。

1)香草选择

查询: SELECT * FROM tbl WHERE index_col = xx

结果: 绘制

这些的速度大致相同,并且如所期望的那样,在要选择的列的数量上是线性的。 InnoDB似乎比MyISAM 稍微快一些,但这实际上很微不足道。

码:

import timeit
import MySQLdb
import MySQLdb.cursors
import random
from random import randint

db = MySQLdb.connect(host="...", user="...", passwd="...", db="...", cursorclass=MySQLdb.cursors.DictCursor)
cur = db.cursor()

lengthOfTable = 100000

# Fill up the tables with random data
for x in xrange(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

    cur.execute(insertString)
    cur.execute(insertString2)
    cur.execute(insertString3)

db.commit()

# Define a function to pull a certain number of records from these tables
def selectRandomRecords(testTable,numberOfRecords):

    for x in xrange(numberOfRecords):
        rand1 = randint(0,lengthOfTable)

        selectString = "SELECT * FROM " + testTable + " WHERE index_col = " + str(rand1)
        cur.execute(selectString)

setupString = "from __main__ import selectRandomRecords"

# Test time taken using timeit
myisam_times = []
innodb_times = []
memory_times = []

for theLength in [3,10,30,100,300,1000,3000,10000]:

    innodb_times.append( timeit.timeit('selectRandomRecords("test_table_innodb",' + str(theLength) + ')', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('selectRandomRecords("test_table_myisam",' + str(theLength) + ')', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('selectRandomRecords("test_table_memory",' + str(theLength) + ')', number=100, setup=setupString) )

2)计数

查询: SELECT count(*) FROM tbl

结果: MyISAM获胜

这表明了MyISAM和InnoDB之间的巨大差异 - MyISAM(和内存)跟踪表中的记录数,所以这个事务很快并且O(1)。 InnoDB计算所需的时间量在我调查的范围内随表格大小呈超线性增长。 我怀疑在实践中观察到的MyISAM查询中的许多加速都是由于类似的影响。

码:

myisam_times = []
innodb_times = []
memory_times = []

# Define a function to count the records
def countRecords(testTable):

    selectString = "SELECT count(*) FROM " + testTable
    cur.execute(selectString)

setupString = "from __main__ import countRecords"

# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE test_table_innodb"
    truncateString2 = "TRUNCATE test_table_myisam"
    truncateString3 = "TRUNCATE test_table_memory"

    cur.execute(truncateString)
    cur.execute(truncateString2)
    cur.execute(truncateString3)

    for x in xrange(theLength):
        rand1 = random.random()
        rand2 = random.random()
        rand3 = random.random()
        rand4 = random.random()

        insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)
        cur.execute(insertString3)

    db.commit()

    # Count and time the query
    innodb_times.append( timeit.timeit('countRecords("test_table_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('countRecords("test_table_myisam")', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('countRecords("test_table_memory")', number=100, setup=setupString) )

3)条件选择

查询: SELECT * FROM tbl WHERE value1<0.5 AND value2<0.5 AND value3<0.5 AND value4<0.5

结果: MyISAM获胜

在这里,MyISAM和内存执行大致相同,并且对于较大的表,大约50%击败了InnoDB。 这是MyISAM的好处似乎被最大化的那种查询。

码:

myisam_times = []
innodb_times = []
memory_times = []

# Define a function to perform conditional selects
def conditionalSelect(testTable):
    selectString = "SELECT * FROM " + testTable + " WHERE value1 < 0.5 AND value2 < 0.5 AND value3 < 0.5 AND value4 < 0.5"
    cur.execute(selectString)

setupString = "from __main__ import conditionalSelect"

# Truncate the tables and re-fill with a set amount of data
for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE test_table_innodb"
    truncateString2 = "TRUNCATE test_table_myisam"
    truncateString3 = "TRUNCATE test_table_memory"

    cur.execute(truncateString)
    cur.execute(truncateString2)
    cur.execute(truncateString3)

    for x in xrange(theLength):
        rand1 = random.random()
        rand2 = random.random()
        rand3 = random.random()
        rand4 = random.random()

        insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
        insertString3 = "INSERT INTO test_table_memory (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)
        cur.execute(insertString3)

    db.commit()

    # Count and time the query
    innodb_times.append( timeit.timeit('conditionalSelect("test_table_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('conditionalSelect("test_table_myisam")', number=100, setup=setupString) )
    memory_times.append( timeit.timeit('conditionalSelect("test_table_memory")', number=100, setup=setupString) )

4)子选择

结果: InnoDB获胜

对于这个查询,我为子选择创建了一组额外的表。 每个简单的两列BIGINT,一个带有主键索引,另一个没有索引。 由于桌子很大,我没有测试内存引擎。 SQL表创建命令是

CREATE TABLE
    subselect_myisam
    (
        index_col bigint NOT NULL,
        non_index_col bigint,
        PRIMARY KEY (index_col)
    )
    ENGINE=MyISAM DEFAULT CHARSET=utf8;

在第二个表中,'MyISAM'再次代替'InnoDB'。

在此查询中,我将选择表的大小保留为1000000,而是改变了子选定列的大小。

InnoDB在这里很容易获胜。 在我们得到合理大小的表格后,两个引擎都按照子选择的大小线性缩放。 索引加速了MyISAM命令,但有趣的是对InnoDB速度影响不大。 subSelect.png

码:

myisam_times = []
innodb_times = []
myisam_times_2 = []
innodb_times_2 = []

def subSelectRecordsIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)

setupString = "from __main__ import subSelectRecordsIndexed"

def subSelectRecordsNotIndexed(testTable,testSubSelect):
    selectString = "SELECT * FROM " + testTable + " WHERE index_col in ( SELECT non_index_col FROM " + testSubSelect + " )"
    cur.execute(selectString)

setupString2 = "from __main__ import subSelectRecordsNotIndexed"

# Truncate the old tables, and re-fill with 1000000 records
truncateString = "TRUNCATE test_table_innodb"
truncateString2 = "TRUNCATE test_table_myisam"

cur.execute(truncateString)
cur.execute(truncateString2)

lengthOfTable = 1000000

# Fill up the tables with random data
for x in xrange(lengthOfTable):
    rand1 = random.random()
    rand2 = random.random()
    rand3 = random.random()
    rand4 = random.random()

    insertString = "INSERT INTO test_table_innodb (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"
    insertString2 = "INSERT INTO test_table_myisam (value1,value2,value3,value4) VALUES (" + str(rand1) + "," + str(rand2) + "," + str(rand3) + "," + str(rand4) + ")"

    cur.execute(insertString)
    cur.execute(insertString2)

for theLength in [3,10,30,100,300,1000,3000,10000,30000,100000]:

    truncateString = "TRUNCATE subselect_innodb"
    truncateString2 = "TRUNCATE subselect_myisam"

    cur.execute(truncateString)
    cur.execute(truncateString2)

    # For each length, empty the table and re-fill it with random data
    rand_sample = sorted(random.sample(xrange(lengthOfTable), theLength))
    rand_sample_2 = random.sample(xrange(lengthOfTable), theLength)

    for (the_value_1,the_value_2) in zip(rand_sample,rand_sample_2):
        insertString = "INSERT INTO subselect_innodb (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"
        insertString2 = "INSERT INTO subselect_myisam (index_col,non_index_col) VALUES (" + str(the_value_1) + "," + str(the_value_2) + ")"

        cur.execute(insertString)
        cur.execute(insertString2)

    db.commit()

    # Finally, time the queries
    innodb_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString) )
    myisam_times.append( timeit.timeit('subSelectRecordsIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString) )

    innodb_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_innodb","subselect_innodb")', number=100, setup=setupString2) )
    myisam_times_2.append( timeit.timeit('subSelectRecordsNotIndexed("test_table_myisam","subselect_myisam")', number=100, setup=setupString2) )

我认为,所有这些信息都是如果你真的关心速度,你需要对你正在做的查询进行基准测试,而不是假设哪个引擎更适合。


人们经常谈论性能,读写操作,外键等,但我认为存储引擎还有一个必须具备的特性: 原子更新。

尝试这个:

  1. 针对您的MyISAM表发出UPDATE,需要5秒钟。
  2. 当UPDATE正在进行时,比如2.5秒,按下Ctrl-C来中断它。
  3. 观察桌子上的效果。 有多少行更新? 有多少没有更新? 表格是否可读,或者当您按Ctrl-C时是否损坏?
  4. 针对InnoDB表尝试使用UPDATE进行的相同实验,中断正在进行的查询。
  5. 观察InnoDB表。 行更新。 InnoDB保证你有原子更新,如果完全更新无法提交,它会回滚整个更改。 此外,表格并未损坏。 即使您使用killall -9 mysqld来模拟崩溃,这也可以工作。

当然,性能是可取的,但不会丢失数据应该胜过这一点。


对于具有更多写入和读取的负载,您将受益于InnoDB。 因为InnoDB提供了行锁定而不是表锁定,所以你的SELECT可以是并发的,不仅仅是彼此之间,而且还有许多INSERT 。 但是,除非您打算使用SQL事务,否则将InnoDB提交刷新设置为2( innodb_flush_log_at_trx_commit )。 这会让您回想起将MyISAM中的表移动到InnoDB时会失去的许多原始性能。

另外,考虑添加复制。 这给你一些阅读缩放,因为你声明你的阅读不必是最新的,你可以让复制落后一点。 只要确保它能够在最繁忙的交通状况下迎头赶上,或者它总是会落后并永远赶不上。 但是,如果您采用这种方式,我强烈建议您将从服务器的读取和复制延迟管理隔离到数据库处理程序。 如果应用程序代码不知道这个,它就简单多了。

最后,了解不同的表格加载。 所有表上的读写比例不会相同。 一些接近100%的较小表格可以保留MyISAM。 同样,如果您有一些接近100%写入的表,您可能从INSERT DELAYED受益,但仅在MyISAM中受支持(对于InnoDB表, DELAYED子句将被忽略)。

但基准是肯定的。


我不是数据库专家,我不会从经验中发言。 然而:

MyISAM表使用表级锁定 。 根据您的流量估算,您每秒接近200次写入。 有了MyISAM, 只有其中一个可以随时进行 。 您必须确保您的硬件能够跟上这些事务,以避免超出范围,即单个查询最多不超过5毫秒。

这表明你需要一个支持行级锁定的存储引擎,例如InnoDB。

另一方面,编写几个简单的脚本来模拟每个存储引擎的负载,然后比较结果应该是相当简单的。


我已经在表格中简要地discussed这个问题,所以你可以总结一下是否要使用InnoDB或者MyISAM

以下是在哪种情况下应该使用哪个db存储引擎的小概述:

                                                 MyISAM   InnoDB
----------------------------------------------------------------
Required full-text search                        Yes      5.6.4
----------------------------------------------------------------
Require transactions                                      Yes
----------------------------------------------------------------
Frequent select queries                          Yes      
----------------------------------------------------------------
Frequent insert, update, delete                           Yes
----------------------------------------------------------------
Row locking (multi processing on single table)            Yes
----------------------------------------------------------------
Relational base design                                    Yes

总结:

Frequent reading, almost no writing   => MyISAM
Full-text search in MySQL <= 5.5      => MyISAM

在所有其他情况下, InnoDB通常是最好的选择。


有点迟到游戏......但这是一个我几个月前写的相当全面的帖子 ,详细介绍了MYISAM和InnoDB之间的主要区别。 拿一杯茶(也许是一个饼干),并享受。

MyISAM和InnoDB之间的主要区别在于参照完整性和交易。 还有其他区别,如锁定,回滚和全文搜索。

参照完整性

引用完整性可确保表之间的关系保持一致。 更具体地说,这意味着当表(例如,列表)具有指向不同表(例如产品)的外键(例如产品ID)时,当指向表发生更新或删除时,这些改变被级联到链接表。 在我们的例子中,如果产品被重命名,链接表的外键也会更新; 如果产品从“产品”表中删除,则任何指向已删除条目的列表也将被删除。 此外,任何新的列表都必须将该外键指向有效的现有条目。

InnoDB是一个关系数据库管理系统(RDBMS),因此具有参照完整性,而MyISAM则没有。

交易和原子性

使用数据操纵语言(DML)语句(例如SELECT,INSERT,UPDATE和DELETE)来管理表中的数据。 一个事务组将两个或更多的DML语句合并为一个单一的工作单元,因此整个单元都被应用,或者它们都不是。

MyISAM不支持事务,而InnoDB则支持事务。

如果操作在使用MyISAM表时中断,操作立即中止,即使操作没有完成,受影响的行(甚至每行内的数据)仍然受到影响。

如果一个操作在使用InnoDB表时被中断,因为它使用具有原子性的事务,任何未完成的事务都不会生效,因为没有提交。

表锁与行锁

当查询针对MyISAM表运行时,它所查询的整个表将被锁定。 这意味着后续查询只能在当前查询完成后执行。 如果您正在阅读大型表格和/或频繁进行读写操作,则这可能意味着查询积压。

当查询针对InnoDB表运行时,只有涉及的行被锁定,表的其余部分仍然可用于CRUD操作。 这意味着查询可以在同一个表上同时运行,只要它们不使用同一行。

InnoDB中的这个特性被称为并发性。 与并发一样好,有一个主要缺点是适用于选定范围的表,因为在内核线程之间切换时会有开销,并且应该对内核线程设置限制以防止服务器停止。

交易和回滚

在MyISAM中运行操作时,会更改设置; 在InnoDB中,这些更改可以回滚。 用于控制事务的最常用命令是COMMIT,ROLLBACK和SAVEPOINT。 1. COMMIT - 您可以编写多个DML操作,但只有在执行COMMIT时才会保存更改2. ROLLBACK - 您可以放弃尚未提交的任何操作3. SAVEPOINT - 在列表中设置一个点ROLLBACK操作可以回滚到的操作

可靠性

MyISAM不提供数据完整性 - 硬件故障,不清洁的关机和取消的操作可能导致数据损坏。 这将需要完全修复或重建索引和表格。

另一方面,InnoDB使用事务日志,双写缓冲区和自动校验和验证来防止腐败。 在InnoDB进行任何更改之前,它将事务之前的数据记录到名为ibdata1的系统表空间文件中。 如果发生崩溃,InnoDB将通过重播这些日志进行自动回复。

FULLTEXT索引

在MySQL 5.6.4版之前,InnoDB不支持FULLTEXT索引。 在写这篇文章时,许多共享主机提供商的MySQL版本仍低于5.6.4,这意味着InnoDB表不支持FULLTEXT索引。

但是,这不是使用MyISAM的有效理由。 最好转换为支持最新版本MySQL的主机提供商。 不是说使用FULLTEXT索引的MyISAM表不能转换为InnoDB表。

结论

总之,InnoDB应该是您选择的默认存储引擎。 满足特定需求时选择MyISAM或其他数据类型。





myisam