

MySQL




Posted by stevejordan, 11-14-2012, 01:10 PM
How many rows can MySQL handle? If a MySQL database is maxed out, how is it scaled?

Posted by SajanP, 11-14-2012, 01:18 PM
There's no real row limit, but there is a size limit. Here's a good link to the docs with info about various limits in InnoDB: http://dev.mysql.com/doc/refman/5.5/...trictions.html If you get to the point where you're storing a ridiculously large database, you might want to look at partitioning.
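
To make the partitioning suggestion concrete, here is a minimal sketch of RANGE partitioning on a hypothetical table; the table, column, and partition names are purely illustrative:

-- Hypothetical example: split a large table into partitions by year,
-- so queries that filter on order_date only have to scan the relevant partition.
-- Note that every unique key (including the primary key) must contain the
-- partitioning column, hence the composite primary key.
CREATE TABLE orders (
    id         BIGINT NOT NULL,
    order_date DATE   NOT NULL,
    amount     DECIMAL(10,2),
    PRIMARY KEY (id, order_date)
) ENGINE=InnoDB
PARTITION BY RANGE (YEAR(order_date)) (
    PARTITION p2010 VALUES LESS THAN (2011),
    PARTITION p2011 VALUES LESS THAN (2012),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);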

Posted by MikeZavatta, 11-14-2012, 05:50 PM
It can be very difficult to provide scaling suggestions without knowing what type of information is being stored. There are many implementations and they are all different; there is no absolute rule on how to scale. With that being said, you may want to look at sharding. Here is a basic overview of how this works. For some practical examples of how MySQL can be scaled, you may want to check out this blog post on how Twitter stores all their tweets using MySQL. For an overview of how a lot of different companies scale, you can check out this blog post.
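
To give a rough feel for what sharding looks like in practice, here is a minimal sketch with made-up host, table, and column names (not taken from the posts or blog links above): the same schema lives on several independent MySQL servers, and the application routes each row to exactly one of them based on a shard key.

-- Illustrative sketch: identical schema created on each shard server
-- (e.g. shard0.example.com and shard1.example.com).
CREATE TABLE users (
    user_id BIGINT NOT NULL PRIMARY KEY,
    name    VARCHAR(100),
    email   VARCHAR(255)
) ENGINE=InnoDB;

-- The application decides which server to talk to, for example by
-- hashing the shard key: shard_number = user_id MOD 2.
-- So user_id 41 lives on shard 1, user_id 42 on shard 0, and a lookup
-- like this one is sent only to that user's shard:
SELECT name, email FROM users WHERE user_id = 42;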

Posted by File1eu, 11-15-2012, 05:13 AM
Mike_GoDaddy: thanks, those are some interesting reads. I've worked with MySQL databases that have about 10 million rows, and most queries execute in less than 0.00 seconds, as long as the WHERE clause uses indexed columns and disk fragmentation is not extremely high. This was on desktop hardware, btw, so on a fast server optimized SQL queries will be fine as long as you don't become the next Facebook or Twitter.
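
To illustrate the point about indexed columns, a minimal sketch with hypothetical table and column names:

-- Hypothetical table with ~10 million rows.
ALTER TABLE log_entries ADD INDEX idx_account (account_id);

-- With the index, this should come back almost instantly;
-- without it, MySQL falls back to a full table scan.
SELECT * FROM log_entries WHERE account_id = 12345;

-- EXPLAIN shows whether the index is actually used (look for
-- key = idx_account and a small "rows" estimate rather than type = ALL).
EXPLAIN SELECT * FROM log_entries WHERE account_id = 12345;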

Posted by MattF, 11-15-2012, 05:23 AM
How can you get less than 0.00 seconds? Generally, fast queries are those served from memory rather than disk, and this can happen at several layers. If the query cache is enabled, then a query whose hash has already been executed (and not invalidated by updates) can be served with very little overhead. This has problems, though: if you run something poorly designed like Magento, which uses silly precision in its queries to decide what to show, you will miss the query cache constantly, and the granularity of invalidation is also extremely weak last I looked. Then you have the buffer pool, where portions of the data have already been deserialized into memory, so the query can execute very quickly. Finally you have the file cache, which will transparently serve read calls from memory if Linux has the pages cached (not sure if MySQL circumvents this with O_DIRECT etc.); otherwise, and painfully, it goes to disk, and it doesn't matter whether you have a 1995 160MB hard drive or a new Vertex 4 SSD, it is still an order of magnitude slower than memory. As a rule of thumb, keep GET requests entirely served from memory. In terms of scaling, vertical scaling hits limits very quickly and capacity/cost bends against you pretty sharpish (i.e. you enter the poncy world of "enterprise" hardware: overpriced, black-box crap). Most use of MySQL at scale relies on horizontal scaling, usually partitioning the data somehow or replicating it across numerous read slaves. I'm pretty sure highscalability has some interesting articles on MySQL scaling. Last edited by MattF; 11-15-2012 at 05:27 AM.
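
A rough way to see which of those layers is doing the work is to watch the standard status counters; this is only a sketch, and the ratios that matter depend entirely on the workload:

-- Query cache: Qcache_hits vs Com_select shows how often the cache is
-- actually used; Qcache_inserts and Qcache_lowmem_prunes hint at churn
-- and invalidation pressure.
SHOW GLOBAL STATUS LIKE 'Qcache%';
SHOW GLOBAL STATUS LIKE 'Com_select';

-- InnoDB buffer pool: Innodb_buffer_pool_reads counts reads that had to go
-- to disk, Innodb_buffer_pool_read_requests counts logical reads. A large
-- ratio of requests to reads means most data is being served from memory.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';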

Posted by stevejordan, 11-15-2012, 07:34 AM
Here are the my.cnf settings we are using; we found these to be fairly good. Can anyone recommend any improvements on this?

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# To allow mysqld to connect to a MySQL Cluster management daemon, uncomment
# these lines and adjust the connectstring as needed.
#ndbcluster
#ndb-connectstring="nodeid=4;host=localhost:1186"
query_cache_type=1
query_cache_size=50M
query-cache-limit=2M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[ndbd]
# If you are running a MySQL Cluster storage daemon (ndbd) on this machine,
# adjust its connection to the management daemon here.
# Note: ndbd init script requires this to include nodeid!
connect-string="nodeid=2;host=localhost:1186"

[ndb_mgm]
# connection string for MySQL Cluster management tool
connect-string="host=localhost:1186"
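
As a quick sanity check on settings like these (purely illustrative; nothing here is specific to this particular server), the query cache values can be confirmed at runtime:

-- Should show query_cache_type = ON and query_cache_size = 52428800 (50M)
-- if the config above has been loaded.
SHOW GLOBAL VARIABLES LIKE 'query_cache%';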

Posted by File1eu, 11-15-2012, 11:45 AM
Because it's probably around 0.003 seconds, so it gets rounded to 0.00. But you're right, these queries were mostly cached. Still, millions of rows should be no problem for MySQL.

Posted by JayWard_HSW, 11-15-2012, 12:44 PM
Oh, Steve, you're running MySQL Cluster? That's a different animal entirely. In that case, there are quite a few optimizations you can and arguably should do. When we started our implementation of Cluster, I found Several Nines' Cluster Configurator (I can't post a link because I just joined here, but Google knows what I am talking about) to be an invaluable resource and an excellent starting place. They walk you through a fairly basic, yet rather optimized, generic installation of Cluster. Once you have that up and going, scaling is done by adding nodes. You have to re-partition the tables after adding data node groups, which is basically the sharding Mike talked about, but it is doable. Also, MySQL Cluster presents a whole slew of additional issues and concerns. With this latest release (7.2.*), they have really done a great job, but it's still its own animal. We had to redesign most of our tables to be able to run effectively on Cluster. You are definitely going to want to test it thoroughly before putting it into production. (That should go without saying with any new technology, but ESPECIALLY with Cluster.) The methods of administering a cluster are much different than those of running one machine, so it is going to take some getting used to.
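
For the re-partitioning step after adding data node groups, the rough shape of the commands in Cluster 7.x is something like the following; the table name is hypothetical, and the exact syntax should be checked against the Cluster documentation for your version:

-- After the new data nodes are started and joined into a node group,
-- each NDB table is redistributed onto the new nodes:
ALTER ONLINE TABLE my_ndb_table REORGANIZE PARTITION;

-- Then reclaim the space freed on the original nodes:
OPTIMIZE TABLE my_ndb_table;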

Posted by MikeZavatta, 11-15-2012, 02:57 PM
You are welcome! I am glad you found them as interesting as I did!

Posted by alyak, 11-16-2012, 06:05 AM
You mean there can't be negative time? Less than 0.00 would mean negative. A simple query on 10 million records, with a short table structure that exactly matches an index, will work as he said.

Posted by syleishere, 11-17-2012, 05:52 AM
For 10 million rows with an SSD and mainly reads, the MyISAM format will be the quickest. Going over that, a lot of people just split on a field into separate tables, like a-c_table, d-g_table, etc.; it depends how far you want to optimize, since the smaller the table, the less time it takes to get a query back. To optimize even further under a lot of requests, people can load balance reads between a master and a slave, but obviously only write to the master, so the slave gets the updates and it doesn't work the other way around.
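
A minimal sketch of the alphabetical split and read/write separation described above, with made-up table and column names:

-- Hypothetical split: the same structure duplicated per alphabetical range.
CREATE TABLE customers_a_c (
    name  VARCHAR(100) NOT NULL,
    email VARCHAR(255),
    PRIMARY KEY (name)
) ENGINE=MyISAM;
CREATE TABLE customers_d_g LIKE customers_a_c;

-- The application picks the table from the first letter of the key and
-- sends writes only to the master...
INSERT INTO customers_a_c (name, email) VALUES ('bob', 'bob@example.com');

-- ...while reads can be load balanced between the master and a replicating
-- read-only slave (the slave never receives writes directly).
SELECT email FROM customers_a_c WHERE name = 'bob';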


