Planet MySQL

  • Log Buffer #402, A Carnival of the Vanities for DBAs
    This Log Buffer edition hits the ball out of the park, surfacing yet another unique collection of blog posts from various database technologies. Enjoy!

    Oracle: EM12c and the Optimizer Statistics Console. SUCCESS and FAILURE Columns in DBA_STMT_AUDIT_OPTS. OBIEE and ODI on Hadoop: Next-Generation Initiatives To Improve Hive Performance. Oracle Bundle Patching. Performance Issues with the Sequence NEXTVAL Call.

    SQL Server: GUIDs GUIDs everywhere, but how is my data unique? Questions About T-SQL Transaction Isolation Levels You Were Too Shy to Ask. Introduction to Advanced Transact SQL Stairway and Using the CROSS JOIN Operator. Introduction to Azure SQL Database Scalability. What To Do When the Import and Export Wizard Fails.

    MySQL: Orchestrator 1.2.9 GA released. Making HAProxy 1.5 replication lag aware in MySQL. Monitor MySQL Performance Interactively With VividCortex. InnoDB’s multi-versioning handling can be an Achilles’ heel. Memory summary tables in Performance Schema in MySQL 5.7.

  • Store UUID in an optimized way
    A few years ago Peter Zaitsev, in a post titled “To UUID or not to UUID,” wrote: “There is timestamp based part in UUID which has similar properties to auto_increment and which could be used to have values generated at same point in time physically local in BTREE index.”

    For this post I rearranged the timestamp part of the UUID (Universally Unique Identifier) and ran some benchmarks. Many people store a UUID as char(36) and use it as the row identity value (PRIMARY KEY) because it is unique across every table, every database and every server, and allows easy merging of records from different databases. But using it as a PRIMARY KEY causes the problems described below.

    Problems with UUID

    - A UUID has 36 characters, which makes it bulky.
    - InnoDB stores data in PRIMARY KEY order, and all secondary keys also contain the PRIMARY KEY. So having a UUID as PRIMARY KEY makes the indexes bigger, and they may no longer fit in memory.
    - Inserts are random, so the data is scattered.

    Despite these problems, people still prefer UUIDs because they are unique across every table and can be generated anywhere. In this post I will explain how to store a UUID efficiently by rearranging its timestamp part.

    Structure of UUID

    MySQL uses UUID version 1, a 128-bit number represented as a utf8 string of five hexadecimal numbers:

    - The first three numbers are generated from a timestamp.
    - The fourth number preserves temporal uniqueness in case the timestamp value loses monotonicity (for example, due to daylight saving time).
    - The fifth number is an IEEE 802 node number that provides spatial uniqueness. A random number is substituted if the latter is not available (for example, because the host computer has no Ethernet card, or we do not know how to find the hardware address of an interface on the operating system). In this case spatial uniqueness cannot be guaranteed; nevertheless, a collision should have very low probability.

    The timestamp is mapped as follows: when the timestamp has the 60-bit hexadecimal value 1d8eebc58e0a7d7, the parts of the UUID are set to 58e0a7d7-eebc-11d8-9669-0800200c9a66. The 1 before the most significant digits of the timestamp (in 11d8) indicates the UUID version; for time-based UUIDs this is 1.

    The fourth and fifth parts are mostly constant if the UUIDs are generated on a single server, while the first three numbers are based on the timestamp and so are monotonically increasing. Let’s rearrange the sequence to make the UUID closer to sequential; this makes inserts and recent-data lookups faster. The dashes (‘-’) serve no purpose, so let’s remove them:

    58e0a7d7-eebc-11d8-9669-0800200c9a66 => 11d8eebc58e0a7d796690800200c9a66

    Benchmarking

    I created three tables:

    - events_uuid – UUID binary(16) PRIMARY KEY
    - events_int – an additional BIGINT auto-increment column as PRIMARY KEY, with an index on the UUID column
    - events_uuid_ordered – rearranged UUID binary(16) as PRIMARY KEY

    I created three stored procedures, each inserting 25K random rows at a time into its table, plus three more stored procedures that call the random-insert procedures in a loop and record the time taken per 25K rows as well as the data and index size after each loop. In total I inserted 25M records.

    [Charts: data size, index size and total size in MB, and time taken in seconds, each plotted against the number of inserts × 25,000.] The data size for the UUID table is larger than for the other two tables. For the table with the UUID as PRIMARY KEY, you can see that as the table grows, the time taken to insert rows increases almost linearly.
    Whereas for the other tables, the time taken is almost constant. The UUID table is almost 50% bigger than the ordered-UUID table and 30% bigger than the table with a BIGINT PRIMARY KEY. Comparing the ordered-UUID table with the BIGINT table, the insert time and size are almost the same, though they may vary slightly depending on the index structure.

    root@localhost:~# ls -lhtr /media/data/test/ | grep ibd
    -rw-rw---- 1 mysql mysql  13G Jul 24 15:53 events_uuid_ordered.ibd
    -rw-rw---- 1 mysql mysql  20G Jul 25 02:27 events_uuid.ibd
    -rw-rw---- 1 mysql mysql  15G Jul 25 07:59 events_int.ibd

    Table Structure

    #1 events_int

    CREATE TABLE `events_int` (
      `count` bigint(20) NOT NULL AUTO_INCREMENT,
      `id` binary(16) NOT NULL,
      `unit_id` binary(16) DEFAULT NULL,
      `event` int(11) DEFAULT NULL,
      `ref_url` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
      `campaign_id` binary(16) DEFAULT NULL,
      `unique_id` binary(16) DEFAULT NULL,
      `user_agent` varchar(100) COLLATE utf8_unicode_ci DEFAULT NULL,
      `city` varchar(80) COLLATE utf8_unicode_ci DEFAULT NULL,
      `country` varchar(80) COLLATE utf8_unicode_ci DEFAULT NULL,
      `demand_partner_id` binary(16) DEFAULT NULL,
      `publisher_id` binary(16) DEFAULT NULL,
      `site_id` binary(16) DEFAULT NULL,
      `page_id` binary(16) DEFAULT NULL,
      `action_at` datetime DEFAULT NULL,
      `impression` smallint(6) DEFAULT NULL,
      `click` smallint(6) DEFAULT NULL,
      `sold_impression` smallint(6) DEFAULT NULL,
      `price` decimal(15,7) DEFAULT '0.0000000',
      `actioned_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
      `unique_ads` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
      `notification_url` text COLLATE utf8_unicode_ci,
      PRIMARY KEY (`count`),
      KEY `id` (`id`),
      KEY `index_events_on_actioned_at` (`actioned_at`),
      KEY `index_events_unit_demand_partner` (`unit_id`,`demand_partner_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    (Note: the COLLATE clauses on the binary(16) columns have been removed; binary columns have no character set, so MySQL rejects them.)

    #2 events_uuid

    CREATE TABLE `events_uuid` (
      `id` binary(16) NOT NULL,
      `unit_id` binary(16) DEFAULT NULL,
    ~
    ~
      PRIMARY KEY (`id`),
      KEY `index_events_on_actioned_at` (`actioned_at`),
      KEY `index_events_unit_demand_partner` (`unit_id`,`demand_partner_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    #3 events_uuid_ordered

    CREATE TABLE `events_uuid_ordered` (
      `id` binary(16) NOT NULL,
      `unit_id` binary(16) DEFAULT NULL,
    ~
    ~
      PRIMARY KEY (`id`),
      KEY `index_events_on_actioned_at` (`actioned_at`),
      KEY `index_events_unit_demand_partner` (`unit_id`,`demand_partner_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

    Conclusions

    - Create a function to rearrange the UUID fields and use it:

      DELIMITER //
      CREATE DEFINER=`root`@`localhost` FUNCTION `ordered_uuid`(uuid BINARY(36))
      RETURNS binary(16)
      DETERMINISTIC
      RETURN UNHEX(CONCAT(SUBSTR(uuid, 15, 4),SUBSTR(uuid, 10, 4),SUBSTR(uuid, 1, 8),SUBSTR(uuid, 20, 4),SUBSTR(uuid, 25)));
      //
      DELIMITER ;

    - Inserts:

      INSERT INTO events_uuid_ordered VALUES (ordered_uuid(uuid()),'1','M',....);

    - Selects:

      SELECT HEX(uuid),is_active,... FROM events_uuid_ordered;

    - Define the UUID column as binary(16), since binary does not have any character set.

    The post Store UUID in an optimized way appeared first on MySQL Performance Blog.
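    The rearrangement that the post's ordered_uuid() SQL function performs can also be done on the application side. The following is a minimal illustrative sketch in Python (not part of the original post); it reproduces the worked example above and packs the result into 16 bytes, the equivalent of the binary(16) column:

```python
def ordered_uuid_bytes(u: str) -> bytes:
    """Rearrange a version-1 UUID string so the high timestamp field
    leads, then pack the 32 hex digits into 16 bytes (binary(16))."""
    time_low, time_mid, time_hi, clock_seq, node = u.split('-')
    # Reorder: time_hi + time_mid + time_low makes values near-sequential.
    return bytes.fromhex(time_hi + time_mid + time_low + clock_seq + node)

# The worked example from the post:
rearranged = ordered_uuid_bytes('58e0a7d7-eebc-11d8-9669-0800200c9a66')
print(rearranged.hex())  # 11d8eebc58e0a7d796690800200c9a66
```

    Because the reordering is a fixed permutation of the hex digits, it is reversible, so the original UUID can always be reconstructed for display.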

  • MySQL Debian/Ubuntu packaging sprint
    Debian/Ubuntu packaging sprint participants. From left: James Page, Norvald H. Ryeng, George Lorch, Akhil Mohan, Otto Kekäläinen, Robie Basak. Last week, Canonical invited the MySQL packaging team in Debian to a packaging sprint in their London office, and most of us were able to participate. We’ve met online on IRC and UOSs before, but this was the first time we were all in the same room. The results of our sprint will soon be available in a .deb near you. Since Debian Jessie is currently in feature freeze, most of it will hit Ubuntu first. The two main things we achieved on the MySQL side were to make MySQL 5.6 ready for Ubuntu Vivid (15.04) and to split MySQL, Percona and MariaDB configuration files. The configuration file split reduces coupling between MySQL and variant packages, and it is key to solving most of the challenges the team faces. We can now do packaging work on one variant without coordinating with all the others. That is a major achievement! In addition to this, we also made smaller improvements: parallel builds and MTR test runs, defined and implemented a test plan in order to run the right tests at the right time (build vs. DEP8), and cleanup of the debian/rules file. Not all of this is visible to the end user, but to the package maintainers it’s a big deal. Time flies when you’re having fun, so when the week was over, we still had more we’d like to do with the packages. The work will continue, but at least most of the heavy lifting is now done. I don’t think we ever would have got this far without meeting in person. Let me end by extending my thanks to Canonical for organizing and hosting this sprint! It was really useful and fun! I think we all achieved and learned a lot. So once again, to Canonical and everybody that participated, thank you, and merry Christmas!

  • Open Source Collaboration: This is how we did it ‘together’ !
    Not long ago we were all discussing meeting in person during UDS. We did not have good enough reasons to mobilize the logistics back then, but over time we realized the vibrant MySQL ecosystem on Debian and Ubuntu needed a brainstorming session. While we did try Google Hangout as the […]

  • A Year In The Life Of MaxScale
    Fri, 2014-12-19 09:14 — mriddoch

    This time of the year it is traditional, at least in the UK, to look back and reflect on the year that is coming to a close. Since we have just produced the release candidate for MaxScale and are looking forward to the GA release early in the New Year, it seems like a good time to reflect on the events that have brought us to this stage in the story of MaxScale.

    Going Public

    The start of 2014 also marked the start for MaxScale, with the first public announcements regarding MaxScale and the first downloadable binaries. MaxScale itself had been started internally before that, but we wanted to hold off on letting it out into "the wild" until there was enough functionality to do more than just make "what it might be" promises. At first we only had tar files available, and only for CentOS/RedHat Linux distributions; we also made the source code available on GitHub for the first time. That first version of MaxScale contained the fundamental modules it needed to demonstrate the functionality we wanted to show, namely the MySQL protocol modules that allowed connections using the MySQL client protocol and allowed MaxScale to connect to the backend MariaDB and MySQL databases. It also contained the first iterations of the monitors that allowed us to track the basic state of a Galera cluster or a master/slave setup using MySQL replication. The two classes of router that MaxScale supports were represented by the read connection router and an early version of the read/write splitter router; this early version included many restrictions on the SQL that would and would not work with the splitter. In the first few months of the year we worked on improving the functionality of these modules and talking about MaxScale at events such as MySQL meetup groups, FOSDEM and Percona Live.
    We also created the MaxScale Google Group as a way for users to communicate with us and for the MaxScale team to announce improvements and bug fixes, and we launched the public bug site. We had always planned MaxScale around five plugin classes: protocols, monitors, routers, filters and authentication. At this time two plugin classes were missing completely, filters and authentication. The filters we could do without for the time being, but we needed some form of authentication, so we built it into the protocol module as a short-term expedient. We built what we called "transparent authentication": we loaded the user data from the backend database at startup and used it to authenticate the users that logged into MaxScale. We had to do this because it is key to the design of MaxScale that we can create multiple connections from MaxScale to the backend databases, and create them at different times in the lifecycle of the client session. This meant we could not simply proxy the packet exchange for the login sequence; we had to do more than that. While this is a vital part of the design, it does lead to some challenges, and it is perhaps one area in which users need to be aware of MaxScale when creating authentication configurations within the underlying database cluster. A considerable amount of time has been invested in answering questions in this area and providing documentation; however, it remains one of those areas that needs more time invested in explaining fully how to set up the required authentication models.

    Improved Galera Support

    We had been considering better support for Galera clusters; in particular, we wanted to segregate the write operations in order to remove write conflicts. We considered writing a Galera-specific router that would send all writes to a single node whilst distributing the reads across all nodes.
    It had been my plan to use this as a tutorial on the subject of writing your own router. It was while doing this that I realised it was not really necessary, since our existing read/write splitter could do the task for us; it just needed a way to determine which node should receive the writes. The read/write splitter used the master bit in the server status to determine which database should receive the writes, so all that was needed was for the Galera monitor to select one of the running nodes as the nominal master and set the master bit for that node. We wanted the selection to be predictable, so that if two MaxScales were front-ending the same Galera cluster they would both choose the same node as the master. Since Galera allocates each node an ID, we select the node with the lowest ID as the master. The result was a solution that provides a Galera cluster as a high-availability solution with read scale-out for MySQL/MariaDB. Failover is very quick, since there is no database activity or binlog manipulation to perform, as there would be with a MySQL replication cluster. What this solution does not give you is any write scale-out ability.

    Users and Surprises

    It had always been my hope that users would find MaxScale useful and want to extend its functionality to fit their particular needs; this is part of the reason for the plugin nature of MaxScale and the division of responsibilities between the plugins. However, it was somewhat surprising to be contacted with the idea of using MaxScale not as a proxy between the database clients and the databases, but as a proxy between the master and slaves in a MySQL replication cluster.
    I have to admit that my first reaction was that this was really not something MaxScale was designed to do, but upon meeting the people proposing it I was convinced not just that it was an extremely good idea, but also that it was something we could fit into the MaxScale architecture without having to make changes to the core or jeopardising the original MaxScale concept or design. So began a period of working closely with them to produce a specialist router module that allows MaxScale to act as an intermediate master within a replication tree. It was particularly nice to have such an unexpected use case presented to us and to find that MaxScale was flexible enough to facilitate it. It was also nice to then see such a well-respected member of the user community publicly talk about this use of our work and even present it at conferences. We have had a number of other requests from users through the year; these have resulted in either the modification of existing modules to fit better in a given environment or the creation of completely new modules. This has included a number of different monitor modules to cover more database deployment architectures, MMM multi-master and MySQL NDB Cluster being two cases. We also produced a hinting mechanism, at the request of a user, so that the original SQL text can include hints as to the destination to which statements should be sent. We have also come across other users who wanted additional functionality and have written, or are writing, plugin modules of their own. This again has been a vindication of our original concept of a pluggable architecture and the flexibility we had hoped to achieve.

    Filters & Other New Features

    Throughout the year we have also added many new features to MaxScale, the biggest probably being the introduction of the filter API and a set of "example" filters for various tasks.
    This has provided us with some simple logging mechanisms, query rewriting and query duplication functionality. Other additions include the ability to monitor not just a single-level master/slave implementation but a fully hierarchical replication cluster. The concept of weighting has been added to the routing decision; this provides a way for MaxScale to manage a set of servers with dissimilar configurations, or to segregate load between servers within a cluster. A client application has been added to allow some management and monitoring of MaxScale. The read/write splitter has been enhanced to remove the limitations that existed when it was first launched, and to add new facilities to control routing or support constructs not previously available. Authentication has also undergone some change, with the addition of a mechanism that allows MaxScale to track new or changed users within the database, the ability to wildcard-match IP addresses in the MySQL user table, and support for users that have been given access to only a subset of databases. The MySQL replication cluster monitor has also been extended to measure slave lag within the cluster. This slave lag is made available to the routing modules, so the read/write splitter can be configured to disregard slaves that are more than a certain amount behind the master, or to always send read operations to the slave that is most up to date. As well as new features and enhancements, we have spent a lot of time fixing bugs, writing test cases and generally trying to improve MaxScale and its documentation. Packaging is another area that has benefited from more attention during the year, with RPM and Debian packages being introduced, allowing much easier installation of MaxScale and reflecting support for more Linux distributions.
    Beta Testing & Towards First GA

    In the last few months the focus of the MaxScale team has shifted towards the upcoming GA release, with new features playing less of a role in development. That is not to say there have not been any new features, but these have been targeted for release after the first GA. This allows us to concentrate on stabilising and more fully testing the features in the GA release without preventing new features from moving forwards. We also engaged with a number of users who expressed an interest in becoming beta testers for the MaxScale GA release. This, together with the internal testing we have been performing, is helping improve the quality of the MaxScale core and the modules that will be included in the first GA release.

    Thank You

    MaxScale of course is not a "one man band", and I would like to thank the developers that have been working with me over the year, Massimiliano and Vilho, plus those that joined the team during the later part of the year, Markus, Martin & Timofey. They have all put an enormous amount of effort into bringing MaxScale to the brink of GA release. I should also not forget Ivan Zoratti, with whom a lot of the early ideas for MaxScale were developed; although he is no longer working with the team, his influence is still felt in many areas. The other group I would like to thank are the users that have given their time to try MaxScale and give us feedback and encouragement that what we are working on is useful and has a place in the database infrastructure. It has also been very gratifying to get personal feedback at events and to see others outside the group start to talk about MaxScale and share some of the same thoughts we have had about what MaxScale can do within a database deployment - even if some of those have surprised us.

    The MaxScale Release Candidate packages are available for download.

    Tags: MaxScale
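    The deterministic master election described in the Galera section above (every MaxScale instance picks the running node with the lowest Galera-assigned ID, so independent instances agree) can be sketched as follows. This is an illustrative Python sketch, not MaxScale's actual C implementation; the node names and the use of None to mark a down node are assumptions for the example:

```python
def elect_master(nodes: dict) -> str:
    """Pick the running node with the lowest Galera-assigned ID.

    Because the choice depends only on the node IDs, two MaxScale
    instances front-ending the same cluster elect the same master.
    """
    running = {name: node_id for name, node_id in nodes.items()
               if node_id is not None}  # None marks a node that is down
    return min(running, key=running.get)

# Hypothetical cluster state: node name -> Galera node ID.
cluster = {'galera1': 2, 'galera2': 0, 'galera3': 1}
print(elect_master(cluster))  # galera2
```

    If the elected node goes down, re-running the election over the remaining nodes yields the next-lowest ID, again identically on every instance, which is what makes failover predictable.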



Copyright © 2001 - 2013 K-Factor Technologies, Inc.
