Tuesday, December 31, 2013

Fun with Bugs #28 - regression bugs in MySQL 5.6

2013 was a great year for MySQL Community. New MySQL 5.6 GA release with its increased throughput, scalability and new features as well as more interaction and cooperation with MySQL Community from Oracle side brought us a lot of new perspectives and good feelings over the year.

Unfortunately new MySQL 5.6 GA release also reminded about old and well known problem with new MySQL versions. They all introduce new regression bugs. MySQL 5.6 had not become an exception.

Note that according to good old tradition (that I hope will be followed in 2014) bugs that demonstrate a regression (make some feature that previously worked stop functioning as intended in a new release) are marked with "regression" tag at http://bugs.mysql.com. So, it's easy to find them, and here is the list of regression bugs that affect MySQL 5.6 (sometimes regression bug happens in several major versions, as fixes also happens in several versions).

You'll see just 31 bugs in the list of active ones affecting 5.6. Do not become very optimistic because of this. Unfortunately good tradition to use "regression" tag is not followed by some Oracle engineers , so some bugs clearly demonstrating a regression are not tagged properly. Check well known Bug #69623, "since 5.5.32 & 5.6.12, innodb cant start with own multi-file tablespace", for example. The fact that it's a regression is clear from the phrase above even (synopsis), but still I do not see a "regression" tag.

So, regressions still happens (and sometimes it's hard to note/find them before you hit them), even though we all know that "Automated testing and well-written test cases can reduce the likelihood of a regression.". It seems we have reasons to suspect that either test cases development, or proper automated testing (or both) is still far from perfect in Oracle (and maybe only a bit better in other companies that provide MySQL builds or forks).




To give you some ideas of what kind of regression bugs were noted in MySQL 5.6 in 2013, let me recent 10 verified regression bugs:
  • Bug #71244, "Wrong result computation using ALL() and GROUP BY". Older versions, like 5.6.11, do not have this problem. It can be easily workarounded by NOT using alias of aggregated column in HAVING clause, but still it's sad to see that this simple enough kind of query is not covered by regression tests.
  • Bug #71220, "pow() function returns an error for bad values". It's a representative of an interesting class of "regression"-like bugs. New versions change something in good old MySQL behavior, to make it "better", more close to SQL Standard or "improve" it. But this improvement affects users who upgrade, and manual does not help them to be better prepared for this kind of "improvement", as this change is not listed as incompatible in any prominent place. Just a change to better that ends up as a regression from user's point of view.
  • Bug #71095, "Wrong results with PARTITION BY LIST COLUMNS()". I've reported it based on customer issue after we found a workaround (different way to partition a table). Still, it was a very bad surprise (as most regression bugs) and it took notable efforts to reduce the problem to a simple test case. Bug got immediate attention and simple patch from Mattias Jonsson (patch solved the problem according to my test), so I hope to see the fix in 5.6.16.
  • Bug #71055, "Using IF EXISTS(SELECT * ...) acquires a lock when using read uncommitted". READ UNCOMMITTED isolation level had never been widely used and thus one should not expect it to be well tested, but on the other hand locking SELECT is even less expected with this level. It seems READ UNCOMMITTED is not covered by regression tests properly, otherwise Oracle could not miss this regression/change of behavior comparing to MySQL 5.1.x.
  • Bug #70819, "SHOW ENGINE INNODB MUTEX does NOT work with timed_mutex properly". I've noted this undocumented change in behavior of 5.6 vs 5.5 while working on my PERFORMANCE_SCHEMA-related presentation. It's clear that in MySQL 5.7+ PERFORMANCE_SCHEMA will probably replace not only SHOW PROFILE, but also some other SHOW statements. But while functionality is there officially (according to the manual), it should not just disappear, whatever good intention one may have or whatever corner case we speak about.
  • Bug #70617, "Default persistent stats can cause unexpected long query times". This is a good example of visible performance regression related to a new feature and lack of details on how it works or how to use it properly. For poor users it's just like MySQL 5.6 started to work really slow on some queries in some cases. Then it take many hours (if not days) for several people to really find out what's wrong, fix real life case with a workaround and then reduce it to something that can be accepted as a verified bug. I had entire post about this specific case. It has a lot of details about the process of forcing Oracle to consider this case seriously...
  • Bug #70598, "Premature expression evaluation prevents short-circuited conditionals". One more case when code that just worked in MySQL 5.1 stopped working after upgrade to 5.5, and none of the never versions had not helped. Workaround exists, but if user has to change the code just because of upgrade, and had no way to find out that is is going to be needed before he hit the bug, it's already bad.
  • Bug #70491, "SELECT DISTINCT may return a wrong result if a join buffer is involved". One more case of the optimizer-related regressions, and ones that could be found by proper QA. It was reported by Igor Babaev from MariaDB. As a side note, too many bugs in 5.6 are related to all cases and kinds of statements involving aggregation. Something to think about.
  • Bug #70466, "No results when filesorting with a correlation in subquery's HAVING clause". Again HAVING and regression in MySQL 5.6. Oracle started to change optimizer in MySQL 5.6, based on old MySQL 6.0 ideas and some new ones, but it's clear that new developments in this area are NOT properly supported by regression tests.
  • Bug #70351, "ALTER TABLE ADD CONSTRAINT xxx FOREIGN KEY adds two constraints". Sometimes fixes/changes happen in all major MySQL versions. In cases like these one has to care a lot to test properly, as change or related regression may affect wide user base. This bug is just an example of a change that led to regression in all versions after just a minor upgrade.
I should stop at this stage, as it's time for a New Year wishes. I with all Oracle MySQL engineers to care about regression testing properly!

And to Oracle MySQL managers I wish to recognize the efforts of their colleagues who process bugs from community, and to award those who really work and care. As a hint, note who worked on the bugs mentioned above, who verified most of them (5 out of 10), who added more test cases and who contributed patches promptly. These are your most valuable assets, and I hope you will care about them properly!

Saturday, December 28, 2013

Fun with Bugs #27 - bug reports from my teammates at Percona

Surely, I am not the only one in Percona who reports MySQL bugs. In my old post, "17 Famous MySQL Bug Reporters", I've already mentioned Roel Van de PaarAlexey Kopytov and Peter Zaitsev.They had contributed a lot over years.

In this post I'd like to concentrate on bug reports from my Support colleagues at Percona. Many of their contributions are notably more important than anything I've ever reported. Many bugs they reported are fixed. Oracle recently started to recognize in public Community contributions in a form of the bug reports, so you had a chance to see some of the names mentioned below, with explicit thanks to them. Still, I think it makes sense to repeat them again.

So, here is a list of my colleagues who used to work in Percona Support team in 2013 and report MySQL bugs, with total number of bug reports from them as of today and 1-2 reports highlighted as most important ones. You can click on the name to see the list of all bugs reported by each person, started from the recent ones:
  • Jaime Sicam had reported only one MySQL Workbench bug this year (that was fixed in 6.0.1). But his older bug report, Bug #64922, "Foreign Key Error on CREATE TABLE after ALTER TABLE and DROP TABLE statements.", that I verified in 2012 while working in Oracle, is still waiting for some attention (even though it may be already fixed in 5.6.6+).
  • Jervin Real had reported 11 bugs in total, 8 of them in 2013. Check his Bug #70404, "Bit Value Not being Dumped Properly by mysqldump". BIT data type causes many small but annoying problems in MySQL, and inability to dump values properly is one of them. Use TINYINT or CHAR(1) instead, really...
  • Justin Swanhart is already famous as bug reporter based on his contributions to MySQL 5.6. He had reported 21 bugs in 2013. Oldest of his reports, Bug #68607, "REPLACE statement not properly logged in binary log in RBR", is still just "Verified" though.
  • Miguel Angel Nieto was also recognized as a valuable contributor by Oracle, because of Bug #69861 he had reported this year (fixed in 5.6.15). His old documentation request, Bug #63128, "explanation of the behavior of innodb_autoinc_lock_mode = 1 with INSERT IGNORE", is still waiting for some attention.
  • Muhammad Irfan had reported Bug #70537, "No users created under MySQL system database for RPM based installation", that ended up as a verified documentation request.
  • Ovais Tariq had reported 11 bugs in total, of them 5 in 2013. Most of his recent reports are still waiting for the fix, and they are pretty serious. Check Bug #69680, "Auto_inc value not properly generated with RBR and auto_inc column only on slave", for example.
  • Paul Namuag - had reported Bug #71188, "Strange beheavior ON DUPLICATE KEY UPDATE when auto_increment reaches MAXINT", recently. It was declared "Not a bug", but I think there is still something to fix. Hence my Bug #71232, "Wrong behaviour for auto_increment unsigned bigint column approaching max value ". 
  • Przemyslaw Malkowski had reported 6 bugs this year, and 4 of them are still waiting for the fixes. Check his recent finding, Bug #71211, "ARCHIVE engine does not guarantee UNIQUE and PRIMARY KEY constraints".
  • Raghavendra Prabhu  had reported Bug #69969, "Failing assertion: prebuilt->trx->conc_state == 1 from subselect", this year. Still waiting for the fix. Note that this bug was found as a result of his ongoing QA efforts using RQG. It seems Raghu has more than one account in the bugs database, so here is the list of 5 more bugs he had reported (3 of them in 2013).
Even though he does not work in Support formally, but Laurynas Biveinis must be mentioned in any good list of MySQL contributors. With his 54 bug reports in total (many of them with patches), of them 27 reported in 2013, he is one of the key Community contributors recently. Still, some of his reports, like Bug #68725, "UNIV_MEM_DEBUG needlessly slow, especially with UNIV_ZIP_DEBUG", are waiting for formal processing, not even the fix...

Anyway, as you can see, Percona employees contribute not only to Percona software users, but also to upstream MySQL users. We report bugs, so we care!

Fun with Bugs #26 - MySQL bugs Oracle had not fixed for me (yet)

In the previous post in this series I've listed 15 MySQL bug reports, documentation and feature requests I've made in 2013 that got fixes or any other kind of solution. Now it's time to check what happened to the rest and try to think why.

First of all, no MySQL bug reporter is perfect (if only Domas), so some bug reports may be false alarms ("Not a bug"), to hard to fix at any foreseeable future ("To be fixed later") or asking for something that Oracle does not plan to provide at all ("Won't fix" or "Unsupported"). Some of my bug reports this year felt into these categories:
  • Bug #71205 - "Queries to P_S seems to pass extra stages related to query cache". This is "Not a bug", as it seems access to query cache happens even before SELECT is parsed enough to find out it accesses table(s) in PERFORMANCE_SCHEMA. So, if you want to avoid extra overhead in general case make sure you start all your queries to PERFORMANCE_SCHEMA with SELECT SQL_NO_CACHE ... I'd prefer to see this explicitly mentioned in the manual, but maybe it's only me.   
  • Bug #71170 - "Please, make MySQL RPMs relocatable". This one was set to "Unsupported", so Oracle does not plan to provide relocatable packages and users who want to test/use multiple versions of MySQL server should either rely on chroot environments, .tar.gz packages and sandboxes or wait for other MySQL providers to make their RPMs relocatable (if ever). Fair enough.
  • Bug #71041 - "Please, document every instrument in P_S.setup_instruments in details". Here I see "To be fixed later", so it seems documentation team has more important things to do than to document all the details of PERFORMANCE_SCHEMA instrumentation. This is sad as it's hard to use instrumentation now without knowing all ins and outs of the code. I am also not satisfied with official reasons to have this undocumented listed in the manual.
  • Bug #69399 - "Inconsistency in crash report". It's "Won't fix" and for a really good reason explained by Shane. Signal handler must use approved safe functions or risk causing a crash itself. time() is safe but returns UTC. So, you should expect that for crashes, assertion failures and outputs caused by other signals you get timestamps in UTC, no matter what your timezone is. So, this was my fault.
Some bug reports are still not processed ("Open" or "Analyzing"). That's expected for complex reports, those without clearly repeatable test case (I try to NOT send reports like this) or related to not clearly documented features of MySQL server. But I have one feature request that is probably just missed. It's the Bug #70196 - "DISCARD/IMPORT tablespace is not supported for partitioned InnoDB tables". There is no way to IMPORT just one partition, or, for that matter, to backup or restore from backup only one partition, even when it is stored in the individual .ibd. file and you use advanced tools like MySQL Enterprise Backup for backups. This is unfortunate that nobody explicitly cares about this.

The rest of my reports sent this year were accepted as valid and has "Verified" status. One can hardly expect for 20+ valid MySQL server or documentation bugs he reported to be fixed over a year, so this situation is not a problem. But I still would like to remind about a couple of reports that are verified long time ago and still do not get any visible attention or fixes:
  • Bug #68097 - "Manual does not explain that some P_S instruments must be enabled at startup". it was reported on January 16, 2013 and it seems already clear what instruments (mutexes actually) must be enabled upon startup (and then can be disabled and enabled back dynamically at runtime), by design. Slide 35 of my presentation for Percona Live London 2013 quotes the explanation made by famous Oracle's performance expert, Dimitri Kravtchuk, in his blog. But the documentation request is still just "Verified".
  • Bug #69574 - "Slave crashes when applying row-based binlog entries in cascading replication...". It affects MySQL 5.6, was verified 6 months ago, sounds serious enough and still had not got any single comment after verification.
For other bug reports I agree to wait patiently... Just note that while I like to play with PERFORMANCE_SCHEMA or find some minor problems in the manual just for fun, most of my bug reports for InnoDB, Optimizer, Partitioning and Replication categories are based on real life issues noted by Percona customers. They were not really funny to hit in production and caused notable problems.

Friday, December 27, 2013

Fun with Bugs #25 - MySQL bugs Oracle fixed for me this year

I've checked recently and noted that I've sent 50 reports about MySQL bugs, features I'd like to see and unclear/missing manual pages this year. It all started with famous Bug #68079 (reported on January 14, 2013), that got a lot of attention, valuable workaround from Oracle and caused a lot of work that is going to improve MySQL scalability substantially in the future.

Oracle had also implemented this my (and not only mine!) feature request, Bug #69527, and in MySQL 5.7.3 PERFORMANCE_SCHEMA finally exposes metadata locks information. This is a great and long waited step forward in instrumentation.

Besides that, 12 of my documentation requests were satisfied:
  • Bug #68089 - "Manual refers to wrong (old?) column names for some P_S tables"
  • Bug #68181 - "Release notes for 5.6.11 reference wrong bug number"
  • Bug #68223 - "Manual for ALTER TABLE is wrong about mixing partitioning and other changes"
  • Bug #69697 - "Manual has not enough details on how to use transportable tablespaces"
  • Bug #69701 - "ALTER TABLE ... IMPORT TABLESPACE does not check foreign keys"
  • Bug #69717 - "DML statements replicated via RBR are not logged in the general query log"
  • Bug #69865 - "Wrong default MESSAGE_TEXT values for SIGNALs are listed in the manual"
  • Bug #70682  - "The description for --version-check in the summary table is wrong"
  • Bug #70683 - "The description is wrong for --server-public-key-path=file_name"
  • Bug #70741 - "InnoDB background stats thread is not properly documented"
  • Bug #70991 - "Manual seems to recommend IDEMPOTENT mode for all cases of master-master"
  • Bug #71103 - "Wrong syntax is used as example in the manual"
Thank you, Jonathan Stephens and Paul Dubois, for your hard work to improve MySQL manual! I know, you had hard times with me more than once...

Also just yesterday one my feature request for http://bugs.mysql.com was finally implemented, Bug #70631, "8K limit for "How to repeat" filed is too restrictive"! I hit it several times during this year and it seems now we have a new limit, 32K. Good news!

Anything else? Actually, no. So, 15 out of 50 or so (I still have time to report few more). Of them one new feature in 5.7.3 9very important one) and one performance problem studied, with more work to do... Why is that so, maybe the rest of my reports were irrelevant, wrong or useless? You can check yourself or wait for the next issues of "Fun with Bugs", where I'll try to check other bugs I've reported in 2013.

Sunday, November 3, 2013

Teaser on my upcoming Percona Live London 2013 session

As you probably know already, I have a session on PERFORMANCE_SCHEMA at the conference, scheduled at 12 November 4:00pm - 4:50pm @ Orchard 1. Presentation is mostly ready, but I had not decided yet when to publish it. In the meantime, for those really interested, here is a teaser.

Below I list one link for each slide (in order of presentation) having more than one of them mentioned or listed in my notes. Now try to guess what I am going to say there and why. Note that it's not a tutorial (my half a day tutorial on PERFORMANCE_SCHEMA was not accepted by the conference committee, and this is probably good decisions, as I am usually very good in explaining what's bad or what should never be done and much worse in "best practices"). So, it's hardly a usual material on PERFORMANCE_SCHEMA...
  1. http://bugs.mysql.com/ - this is obvious, I am always speaking about bugs in any MySQL-related context.
  2. American Civil War - surely I plan to say something related to history
  3. Observer effect (physics) - ... and physic.
  4. How Percona does a MySQL Performance Audit - you always wanted to know this, aren't you?
  5. Science - yes, it's more science than art here, at Percona...
  6. http://www.juliandyke.com/Diagnostics/Events/EventReference.html - useful reference if you ever plan to move (back?) to good old Oracle RDBMS performance tuning
  7. SHOW INNODB STATUS walk through - I hope you've read this more than once. Maybe you had written something better on the topic? Then, please, share in the comment - I'd really want to read it one day.
  8. timed_mutexes system variable - do you know what is it for?
  9. http://dev.mysql.com/doc/refman/5.6/en/innodb-metrics-table.html - manual page for INFORMATION_SCHEMA.INNODB_METRICS table. So, I use references to the manual.
  10. http://dom.as/2013/04/17/on-performance-schemas/ - and I mention Domas and his posts more than once. That's good in any MySQL-related context.
  11. Bug #24795. This is one of many bugs mentioned. This time it's just a reminder about Jeremy Cole and what he did for MySQL many years ago.
  12. pt-pmp - as I work for Percona now, I surely have to mention Percona Toolkit (probably it's among my contract terms). This is not the only tool I mention.
  13. Bug #61545. Guess what this bug has to do with PERFORMANCE_SCHEMA in MySQL 5.6.
  14. Bug #69236. MySQL 5.6 is slow for single-threaded workloads. We all know this, so what?
  15. http://www.fromdual.ch/mysql-oprofile. Oli also has a session at PLUK 2013. It would be nice to meet him finally.
  16. http://dom.as/tag/dtrace/ - had you ever seen what Domas can do with DTrace on Mac OS X?
  17. http://marcalff.blogspot.com/ - not at the very beginning, but surely I mention a person who had actually implemented PERFORMANCE_SCHEMA...
  18. http://dev.mysql.com/doc/refman/5.6/en/performance-schema-instrument-naming.html - in some cases we just need a reference.
  19. http://mysqlentomologist.blogspot.com/2013/01/how-to-use-performanceschema-to-check.html - quoting myself, why not...
  20. Bug #68097. I've reported this bug, so I speak about it. Guess how many times.
  21. http://www.slideshare.net/Leithal/performance-schema-andpshelper - check this if you want to see some really good presentation on PERFORMANCE_SCHEMA. Mine has almost nothing in common with it.
  22. WL #2360. Do you know that some good old MySQL WorkLogs are still available in public?
  23. Why Performance Schema Overhead?.. - I have some ideas, but who can explain it better than DimitriK?
  24. Bug #69527. What, again a bug? This time it's a feature request and it's implemented in MySQL 5.7.3.
  25. Bug #68514. They say it's "Not a bug". Decide for yourself...
  26. Bug #68413. I wonder how many time I should mention this bug in public before it is closed somehow (not even fixed, just closed, somehow)...
  27. Bug #68855. It's a feature request, not directly related to PERFORMANCE_SCHEMA.
  28. Bug #68079. I've reported this bug and it surely shows how Oracle cares about MySQL.
 That's all. Now you know that there are least 28 slides in my presentation, that I am going to mention some names and speak a lot about bugs. If you want to check how all the above are related to real presentation, wait for the conference and attend my session (or read entire presentation soon). In any case it's now less than 10 days to wait.

If you want me to say (or NOT say) something special related to PERFORMANCE_SCHEMA, please, add a comment. I still have some time to add more slides or change my mind entirely...

Sunday, October 27, 2013

Recalculating InnoDB Persistent Statistics - a Story of the Bug Report

One of the first posts in this blog was about reporting MySQL bugs "properly", in a way that maximizes chances for it to be processed really soon. I had written the following there:
"Ideally, you should provide a complete test case and/or instructions that any reader can use to reproduce your problem"
Indeed, if one can just copy/paste something to mysql command line client or run some file attached to see the problem, chances are high for the bug to be processed really soon. We all like to get low hanging fruits from time to time, and Oracle engineers who work on bugs are not exceptions. But does this mean that bug without clear test case has no value and is going to be ignored?

It should NOT be the case. Let's review Bug #70617 reported by my colleague some time ago based on the problem we helped our customer to solve. There bug reporter just copied somewhat edited email with problem description we've got, and it was referring to a very complex SQL statement (joining 17 tables or so) that worked slow when executed from some PHP application on MySQL 5.6, but always worked fast when similar SQL was executed in mysql command line client and mostly worked fast in good old MySQL 5.5 in both cases.

The bug report contained this kind of problem description and listed workarounds we found: replacing all INNER JOINs with STRAIGHT_JOIN (trick that is useful more often than I'd like it to be) let us get not ideal, but consistent performance, while recreating the tables with persistent statistics disabled just let us get the same good performance as in 5.5 (take the fact that persistent statistics for InnoDB tables is enabled by default into account every time you see query from older versions performing poorly in MySQL 5.6, by the way).

I let you review that my old post (with the content that was once offered as a talk to MySQL User Conference back in 2006 and recently was submitted for Percona Live London conference, but again was not accepted) and decide for yourself is the bug report good enough to get proper attention. It seems some of my former colleagues in Oracle decided that it is not any good to report something like that, without a simple test case, and they were also offended enough by some statements made there and the fact that bug reporter explicitly does NOT want to spend time on creating small test case not related to customer data... So, bug report was ignored, while I've got few chances to try to defend bug reporter in private chats.

Assuming that without a test case this bug report is going to be ignored longer than I'd like to, I've spent some 30 minutes testing and sending "good" reports:
  • Bug #70629 - "innodb_stats_auto_recalc problem for InnoDB tables with persistent statistics"
  • Bug #70630 - "Why one can access persistent statistics data while they are changing?"
this time with a simple test case to copy/paste that demonstrated both the reason of the problem (persistent statistics is NOT recalculated immediately) and the way this badly influences join order selection by the optimizer even with just 2 tables. I had not even tried to do anything based on customer data - 2 simple tables and standard enough actions allowed to see the problem immediately.

Was that because I am smart or experienced in bugs processing? Not at all, my actions where really simple. Nice and smart test cases for the original  Bug #70617 were created and added by famous Shane Bester next day, probably as soon as he noted the bug (it was hard NOT to note it after my Facebook posts), along with some insights based on code review. As a result, original bug report was verified even before my detailed one with a simple test case.

To complete this story I've just added this documentation request, Bug #70741. Let's hope the manual will soon clearly describe how the background thread that re-estimates statistics really works and explain why one should run ANALYZE TABLE after any big change in data if he plans to run queries against the table immediately, no matter what the setting of innodb_stats_auto_recalc is.


Looks like I've spent more time arguing about the original bug report (both externally and internally) than any good engineer (Shane or me) needed to understand the problem based on description and create a test case showing something that is a problem and may be related to the original report. Even with time to report 3 more bugs taken into account, arguing and "Facebook escalations" took more...

What is the summary?

First of all, engineers who work on bugs should spend few minutes carefully reading and thinking about the problem described before ignoring it or blaming reporter for wrong attitudes just because he had no time to create a simple test case (or even did not wanted to do this, for whatever reasons). Bug report with clear problem statement and solutions found also presented is already good enough to think about it. Why would one assume that others are wasting time on bug reporting without really good reasons?

Bug reporters, on the other side, are surely able to get their bugs processed faster if, instead of explaining why they can not afford spending any more time on bug report and making various statements not directly related to the problem at hand, they spend some time creating a test case to copy/paste. Chances are high that original bug reporter understands the problem way better already than anybody else in the world, so why not to try to make the world better by contributing some more lines of text?

Saturday, September 28, 2013

Fun with Bugs #24 - PERFORMANCE_SCHEMA

It seems that one of my session proposals is accepted for Percona Live London 2013, so I have to prepare myself to speak about PERFORMANCE_SCHEMA new features and problems in MySQL 5.6. Bugs are going to be discussed, among other things. Let's check current active bugs (and some "Not a bug"s) related to PERFORMANCE_SCHEMA in this issue.

I'd like to start with Bug #68514 that got some attention this week again, in despite of its "Not a bug" formal status. Detailed instrumentation comes with a cost, and to reduce high CPU cost (reported as Bug #67736 by Domas at 5.6 RC stage) it was decided to allocate memory in bigger batches. It seems notable (I'd say unexpectedly high) amount of memory may be allocated for performance counters if you have max_connections > 1500 (or table_open_cache > 10000, or table_definition_cache > 10000, or open_files_limit > 30000). So, this is something to take into account if you plan to use MySQL 5.6 in a hope to "scale up" your instance (as performance_schema=1 there by default and thus extra memory is allocated by default).

Speaking about "Verified" bugs, there are only 7 of them now for MySQL 5.6. So it's easy to just list them all:
  • Bug #68413 - "performance_schema overhead is at least 10%". It can be more, but it took a lot of time and efforts from Mark Callaghan and me to force Oracle engineers to accept this claim as valid. Sometimes I feel that public bug reports for PERFORMANCE_SCHEMA from users are considered more as an insult than a useful contribution from community (as it happens with even minor InnoDB problem reports, for example). Anyway, we managed to prove the overhead can really be that big, and later Oracle's own performance guru Dimitri Kravtchuk confirmed in his Bug #70018 that for some use cases just having PERFORMANCE_SCHEMA enabled may cost you 30% decrease of QPS, even if you do not try to use it in any way.
  • Comparing to the above, Bug #69727 reported by Todd Farmer is a minor issue. It seems setting instruments for P_S as server startup options is not so obvious in some cases.
  • Bug #69782 - "Old files not being removed from performance_schema.file_instances ". This may eventually become a problem for MySQL instances that create binary logs or relay logs often.
  • Bug #69915 - "statement/com/Query counter doesn't increment". Yet another bug report from Todd Farmer, who uses P_S and blogs about it a lot. No wonder he finds bugs in the process. This one sounds simple, so I don't really understand why it is still "Verified".
  • Bug #70025 - "Update on P_S setup_consumers and threads through JOIN only updates first row". Minor enough problem was found by yet another my former colleague in Oracle, Jesper Krogh. Jesper also often writes about P_S in his blog and does us a favor by reporting bugs in public bugs database. The most recent (at the moment) still "Verified" P_S bug is also from him: Bug #70028 - "P_S threads.INSTRUMENTED not set according to setup_actors".
 As you can conclude from the above, PERFORMANCE_SCHEMA is created by Oracle engineers for their own use and they use it a lot. So, it is doomed to be useful, especially now, when MySQL 5.6 is becoming widely used by Oracle customers and MySQL community users in production. We can rely on Facebook in this area - as they already use P_S, they'll do their best to force Oracle MySQL engineers to make it as efficient as possible eventually.

It's also clear from the above that one of the main problems of PERFORMANCE_SCHEMA is its name. Users try to name it as P_S, PS, PFS - whatever abbreviation is used, it's better than the original name (even if it comes from some SQL standard that nobody really cares about...). Along with INFORMATION_SCHEMA it makes me wanting to see Oracle's good old public synonyms implemented in MySQL some day... Before that it seems that the only sane way to use P_S on a regular basis in interactive command line client is via ps_helper, created by the godfather of P_S, Mark Leith.

Saturday, September 21, 2013

It's all about bugs fixed: MySQL 5.6.14

Most of MySQL gurus and famous users are probably in San Francisco now, getting ready for fun at MySQL Connect. Part of that fun should come from the announcement of great new MySQL 5.6.14 release (that somewhat silently happened yesterday).

I am sitting at home though and I've seen at best 3 sunny days in September. The rest of the time it rains, so hardly I can do anything more funny and useful than review of MySQL bug reports even during my weekend. Let me try to tell you what MySQL 5.6.14 is really about and what you should expect from it based on the list of bugs fixed. Please, do not blame me if my summary would be different from the upcoming keynotes at MySQL Connect. It rains here...


I'll use good old approach of checking my older posts about bugs in MySQL 5.6.13 and comparing it to the release notes of MySQL 5.6.14. You have to read them carefully anyway to check if any of the bugs you cared about are fixed. Chances are real, with 67 or so bugs mentioned, but previous releases of MySQL 5.6 GA contained notably more fixes. Either release before MySQL Connect was a top priority or MySQL 5.6 is getting mature now, with less problems remaining to solve in new releases.

Let's check, starting from InnoDB bugs I had written about back in June. From the bugs I mentioned there I see no new fixes, even Bug #69179 (that I've mentioned many times and that should be already fixed in recent Percona Server 5.5.x) and Bug #69236 (single thread performance regression affecting Facebook) are still just "Verified". Bug #69325 is also still "Verified", even though at least documenting potential memory use by ALTER against a partitioned InnoDB table (as it was suggested by Shane back in May) would help a lot...

MySQL 5.6.13 was released less than 2 months ago, so surely one should not expect that all new bugs are fixed so fast. But I am really disappointed to see Bug #69892 (that questions the possibility of forced recovery if InnoDB persistent statistics was ever used and that I had written about here) still "Verified".

It can not be that bad, so I checked later post, and found one problem I've cared about solved in MySQL 5.6.14. The problem of improper use of InnoDB tables to store replication information (see Bug #69898) is fixed. Related Bug #69907 is still "Verified", even though recent comment claim it is a duplicate of the previous one and must be fixed in MySQL 5.6.14. Let's assume it's a matter of documenting or just a mistake. In any case, if you plan to use crash safe replication in MySQL 5.6, you should upgrade to 5.6.14 ASAP!

Of the InnoDB bugs I had not paid attention to recently that I'd like to mention the following ones fixed (quote from the Release Notes):
  • InnoDB; Partitioning: Following any query on the INFORMATION_SCHEMA.PARTITIONS table, InnoDB index statistics as shown in the output of statements such as SELECT * FROM INFORMATION_SCHEMA.STATISTICS were read from the last partition, instead of from the partition containing the greatest number of rows. (Bug #11766851, Bug #60071)
  • InnoDB: When logging the delete-marking of a record during online ALTER TABLE...ADD PRIMARY KEY, InnoDB writes the transaction ID to the log as it was before the deletion or delete-marking of the record. When doing this, InnoDB would overwrite the DB_TRX_ID field in the original table, which could result in locking issues. (Bug #17316731)
  • InnoDB: The row_sel_sec_rec_is_for_clust_rec function would incorrectly prepare to compare a NULL column prefix in a secondary index with a non-NULL column in a clustered index. (Bug #17312846)
  • InnoDB: An incorrect purge would occur when rolling back an update to a delete-marked record. (Bug #17302896)
  • InnoDB: An assertion would be raised in fil_node_open_file due to a missing .ibd file. Instead of asserting, InnoDB should return false and the caller of fil_node_open_file should handle the return message. (Bug #17305626, Bug #70007)
  • InnoDB: The assertion ut_ad(oldest_lsn <= cur_lsn) in file buf0flu.cc would fail because the current max LSN would be retrieved from the buffer pool before the oldest LSN. (Bug #17252421)
Bug #60071 was waiting since 5.5 GA times and it's great to see it fixed. Bug #70007, on the other hand, was reported just a month ago against 5.6.13. Both fixes should make my colleague Justin happy.

Other bugs sounds serious and show good work of MySQL QA in Oracle probably, but it's hard to say anything more than we see in the Release Notes as all the details are hidden in internal Oracle's bugs database.

Let's move on to replication bugs I mentioned in June. Bug #69444 is still "Verified". Bug #69135 from Giuseppe is till "Open". Even Bug #69097 is still "Verified", since April 30... Read Bug #69095 to find out why you should not switch replication format from STATEMENT to ROW if you use GTIDs and why failures in this case do not indicate any bug... I hope Giuseppe is satisfied (I am just not brave enough to recommend to switch to GTID-based replication in production, yet). Bug #68892 is closed, but it says the fix will go to MySQL 5.6.15...

To summarize, MySQL 5.6.14 had NOT solved any more of my replication-related concerns expressed back in June, comparing to 5.6.13. The only really great improvement is the fix for Bug #69898, as explained above. Your millage may vary though, as I see many fixes to bugs in internal bugs database related to Replication, like these (quote from the Release Notes):
  • Replication: When a master with semisynchronous replication enabled was shut down, the master failed to wait for either a semisyncnronous ACK or timeout before completing the shutdown. This prevented semisynchronous replication from reverting to asynchronous replication and allowed open transactions to complete on the master, which resulted in missing events on the slave.
    To fix this problem, dump threads are now stopped last during shutdown, after the client is told to stop, so that, if the dump thread has pending events from active clients, they can be sent to the slave. (Bug #16775543)
  • Replication: A session attachment error during group commit causes the rollback of the transaction (as intended), but the transaction in which this happened was still written to the binary log and replicated to the slave. Thus, such an error could lead to a mismatched master and slave.
    Now when this error occurs, an incident event is written in the binary log which causes replication to stop, and notifies the user that redundant events may exist in the binary log. An additional error is also now reported to the client, indicating that the ongoing transaction has been rolled back. (Bug #16579083)
  • Replication: START SLAVE failed when the server was started with the options --master-info-repository=TABLE relay-log-info-repository=TABLE and with autocommit set to 0, together with --skip-slave-start.
    A workaround for previous versions of MySQL is to restart the slave mysqld without the --skip-slave-start option. (Bug #16533802)
Probably I have to stop at this stage and try to get some real-life experience with MySQL 5.6.14...

To summarize: if you use MySQL 5.6 in production setup where replication is a key component, you should definitely upgrade to MySQL 5.6.14. Do not expect it to solve all the problems you cared about though, for me it was not the case.

Sunday, September 8, 2013

Fun with Bugs #23 - more on Optimizer bugs in MySQL 5.6

When I've sent CV to MySQL AB back in 2004 (or early 2005) I had actually wanted to become a developer there. As a person who just started to use MySQL on a regular basis and, at the same time, had to explain dozens of engineers per month how optimizers works in Oracle and Informix RDBMSes, and optimize queries for them from time to time, I was naturally interested in adding missing (but well known to me) features to MySQL optimizer (from hash joins to stored outlines, histograms and tracing, all things I've noted as extremely useful for many real life cases)... So, I wanted to work on MySQL optimizer specifically, if something related to MySQL at all.

It happened so that MySQL Support had somehow noted my CV before anybody else, so in few months and after some serious tests and long screening talks (the longest of them was with new, at the time, optimizer developer there) I've ended up in Support and (lucky I am) I've stuck in Bugs Verification Team there, and I had never regret about this... But interest to MySQL optimizer remained. So, soon enough, I've ended up as a person who worked on more optimizer bug reports than anybody else in Support and, later, I've become an "Optimizer Support Coordinator", whatever that meant. Among other things it meant very hot discussions with Optimizer team in development (now all of them whom I worked with are working on MariaDB...), that started with statements like "there are no bugs in Optimizer other than crashes" but ended as a useful cooperation, and mutual understanding (and maybe even some understanding from my side why optimizer really works the way it works, not much...)

Time flies, but I am still interested in MySQL optimizer improvements (it's still so far from what I took as granted in other databases even before 2005). That's why I was surprised and even felt deeply insulted to see just a dozen of visitors on a great session on optimizer development by MariaDB engineers at PLMCE 2013... That's also why I devote this issue of "Fun with Bugs" to recent optimizer bugs, again (just 2 months after previous issue on this topic).

So, make sure you do not miss these recently verified optimizer bugs affecting MySQL 5.6:
  • Bug #70247 - "Using many WHERE conditions makes range scan disabled". We have a hardcoded limit in this case, nothing related to number of rows in the table or even less cost. Take care if you use some software that creates long list of values to compare against with IN - you can easily end up with full table scan.
  • Bug #70236 - some note on debug builds, why should we even care? Well, when the most famous and productive bug reporter of all times reports something, we should pay attention. Impact is not clear, but this kind of bugs surely affects at least any reasonable and regular QA efforts. Also note version affected, 5.6.15 - it means that MySQL 5.6.14 is probably already cloned off, and the problem we see here will affect this release. So, whatever the problem is, it will be with us for 2-3 months for sure.
  • Bug #70220 - "Grouping with a view may report 'Invalid use of group function'". Yet another example of public bug report by Oracle engineer. This is great and what I've asked for in this blog more than once. Everybody should report any MySQL bug (with some exceptions for security bugs and bugs having customer-sensitive data) in public, at http://bugs.mysql.com, no matter what company he works for. I am happy to see this happening!
  • Bug #70038 - "Wrong select count distinct with a field included in two-column unique key". Take care while adding UNIQUE keys to InnoDB tables in MySQL 5.5 and 5.6 - this may lead to wrong results for some queries.
  • Bug #70021 - "Poor execution of a plan with unnecessary "range checked for each record". Looks like all versions may have problems with queries like this:

    select * from t2 left join t1 on (t1.key1 < 3 or t1.key1 > 1020) and t1.key2 < t2.a;
    
    That is, you've just added new index on key2 column, or added additional condition on indexed column, and get notably worse performance. Unexpected and weird, but this happens. Note also yet another example of a great report from MariaDB engineer here (hardly anybody in the world knows more about the way MySQL optimizer works, by the way). You may think whatever you want about Monty and his interviews about MariaDB, Oracle or MySQL future, but engineers working on MariaDB contribute a lot, especially in area of Optimizer. Do respect this, please.
  • Bug #70014 - "MySQL crashes on explain with JSON formatting (debug builds)". Again, any regular QA efforts are affected by this kind of bugs, so they should not be silently ignored because of low impact.
  • Bug #69841 - "SELECT COUNT(DISTINCT a,b) incorrectly counts rows containing NULL". Any good list of bugs should contain regression ones. Here we have a regression comparing to MySQL 5.1, that returns correct results. This is NOT the first optimizer regression comparing to 5.1, and surely not the first bug with DISTINCT. So, take extra care if you upgrade from 5.1 to latest and greatest MySQL 5.6 (or well tested MySQL 5.5, for that matter...) I am still going to proudly wear my MySQL 5.1 T-shirt on any public conference - speaking about quality of GA release, MySQL 5.1 was not better than MySQL 5.6, but we still see cases where it works better than any newer GA releases (unfortunately).
Let's stop at this point. Now I am back from vacation and celebrations of one year outside Oracle, so I plan to make regular posts about MySQL bugs again. Stay tuned!

Monday, August 12, 2013

MySQL Bugs Verification - Is It Really Simple?

While it was explained already by Sveta and others what does it really mean to "Verify" (or "Confirm", in Launchpad/Percona's terms) a bug in MySQL software, and why this step in a bug's life cycle is important, we still often read complains about too much time taken to verify the bug even with a clearly repeatable test case that can be just copy/pasted, like Bug #69985 or notably more serious Bug #69990. Moreover, I often make comments of this kind myself...

So, it seems there is still a need for clear explanation of all steps that may be involved in verification of a MySQL or Percona Server (let's take server as a most complicated case) bug. First of all, both Oracle and Percona require engineers who process bugs to check them on latest releases and/or source code builds of all versions/branches that are currently GA and fully supported, and on all development versions. For Oracle MySQL bug it means check of recent builds from current source code of 5.1, 5.5, 5.6 and 5.7 and, for many kinds of bugs like packaging, also on official binaries of 5.1.71, 5.5.33, 5.6.13 and 5.7.1 provided by Oracle for the platform. Some bugs may be clearly OS-specific even based on description, but others may require checks on Linux, Windows or even Solaris and FreeBSD (if they are NOT repeatable on Linux). Note that there are also different kinds of Windows and different Linux distributions, both 32-bit and 64-bit, and sometimes all these details matter. So, if engineer did not bother to check everywhere before setting bug to "Verified", bug may come back to him with a request from development for some more checks...

On top of that from time to time MySQL developers and QA start to care about regression bugs more than usual and as a result ask engineers who process bugs to try to pinpoint the exact release when supposedly regression bug appeared. Sometimes it mean checks of 3 previous official releases, sometimes it means separate detailed study that only really brave people like Shane (who probably has all releases since 3.23.5x installed and ready to start anyway) do... Note that all these is for a bug with a clear test case, while many bug reports are not that obvious.

Surely, a lot of checks can be automated (as Sveta explained) using smart setups, MTR, MySQL sandboxes and some shell scripting. But then even a small bug/problem in scripting may lead to bugs NOT checked on some important version (like it happened with MySQL 5.6 at pre-GA stage just because it was no longer mysql-trunk, but mysql-5,6, while scripts remain the same). And then you know what happens - people like me note this problem and Pandora's box is opened...

On top of that, every "Verified" MySQL bug in Oracle should be copied to internal bugs database, and at this step one has to run a script (that may have bugs also) from a Web-form providing MySQL bug number, then check copied bug in the internal bugs database, set proper status for it, make sure it had got proper category (as not every category at http://bugs.mysql.com is supported in internal Oracle's bugs database), got proper priority and ends up assigned to the developer lead who can really care about it. Some bugs should obviously be immediately escalated, and there was a separate procedure for this... At least this was the case a year ago.

So, even if I was able sometimes to "Verify" a bug in a matter of 15 minutes since it was reported, proper verification even of a simplest server bug usually takes more than that, even if it was immediately noted by the engineer who had nothing more important to do at the moment. By the way, bugs processing is hardly a 24x7 service in any company, so we should NOT expect some engineer to really monitor all incoming bugs in a real time on Sunday.

Sounds like a really over-complicated procedure, isn't it? Is it any different in Percona? Yes, it's easier here as there is no need to copy from one bugs database to the other - everything is in one place. We also do not release binaries for Windows, FreeBSD or Solaris and thus usually do not care about bugs on these platforms much unless they are repeatable on Linux. But on the other hand Percona provides repositories of RPM and .deb packages, and thus some bugs had to be checked on all recent major releases of RHEL/CentOS, Debian and Ubuntu (that are officially fully supported platforms here). I also have to check on all major versions of Percona Server, 5.1, 5.5 and 5.6 at the moment. On top of that, if bug is not clearly related to Percona-specific feature, we have to check upstream MySQL version and, if it is affected, we have to report and link upstream bug to the Percona server bug. So, again, often we in Percona end up working with two bugs databases, a lot of copy/pasting and following other annoying procedures... So, do not expect Percona server bug to be "Confirmed" in a matter of minutes or days, even if it comes with a simple test case repeatable on a recent version. So it goes.

Summary is simple: please, respect hard and often boring work of engineers who process bugs and give them some time before complaining (this is a reminder for myself as well).

If you care a lot about the bug, probably you should just open a support request with a vendor (I hope you have a support subscription, if you really care that much?) and then use your power of a customer to make things happen. If it does not work this way - tell me and let me open Pandora's box (or can of worms, if you prefer) at Facebook...

Friday, August 9, 2013

Fun with Bugs #22 - Some Bug Reports You Should Not Miss

Yet another user installed MySQL 5.5.32 yesterday and got a system that can not start... It's really easy to help in this case - just downgrade back to 5.5.31 or upgrade to 5.5.33 if you can. Why problem happened during upgrade? Because of a regression bug #69623.

This case that was easily solved during a quick chat reminded me about the problem of bugs in production. Nobody expects any sane DBA to review every new bug report, but some of them should not be missed, at least when upgrading to any newer version. Regression bugs (I see 15 here reported for MySQL 5.6 GA versions and still "Verified", and it was a search for "regression" tag that may not be always used...) are in this category, same as bugs in new features that may be just enabled/always there by default. Let me list a few more for 5.6.13:
  • Bug #69325 - "MySQL uses significantly more memory for ALTER TABLE than expected". Imagine you are trying to use more partitions than usual because MySQL 5.6 allows it, and plan to enjoy fast ALTER maybe while adding some indexes... just to end up swapping as crazy and everything hanging. Surprise...
  • Imagine you use replication in MySQL 5.6, with status stored in tables:

    mysql> show variables like '%info%';
    +---------------------------+----------------+
    | Variable_name             | Value          |
    +---------------------------+----------------+
    | master_info_repository    | TABLE          |
    | relay_log_info_file       | relay-log.info |
    | relay_log_info_repository | TABLE          |
    | sync_master_info          | 10000          |
    | sync_relay_log_info       | 10000          |
    +---------------------------+----------------+
    5 rows in set (0.02 sec)

    and just run CHANGE MASTER from time to time (or some tool may do this even if you do not know about it) and restart your server. You know what? You may easily end up with:
  • Bug #69825 - "InnoDB: Assertion failure in thread ... in file lock0wait.cc line 297", or
  • Bug #69898 - "change_master() invokes ha_innobase::truncate() in a DML transaction" and same assertion as above actually, and then upon restart...
  • Bug #69907 - "Error(1030): Got error -1 from storage engine" and no way to start up even with innodb_force_recovery maybe...
Why is that so? Probably at least partially because you blindly trusted Oracle MySQL 5.6 GA status and had not cared to monitor bug reports... I'll speculate about possible reasons in some other post.

Is there any way to prevent this kind of troubles? Nobody can guarantee bugs free releases for you, unfortunately, but monitoring bugs database for any new bugs or at least some other sources that do monitor bugs database, like this my blog or my Facebook page, give you notably more chances to prevent unexpected troubles. So, take care...

Sunday, August 4, 2013

Fun with Bugs #21 - recently verified bugs in MySQL 5.6.13

Notable contribution of MySQL Community to MySQL 5.6.13 was explicitly recognized recently. But users and contributors still continue their efforts, as well as Oracle engineers. Even though MySQL 5.6.13 has been generally available just for few days, we already have several new bug reports and updates to known bugs at http://bugs.mysql.com. Let me present a short list with some comments.

  • Bug #69915 is a great example of a "new thinking" inside Oracle. Todd Farmer does not only write about new ways to use PERFORMANCE_SCHEMA in MySQL 5.6 in his blog, but also reports bugs found in  the process to the public bugs database. This is what every responsible MySQL engineer in Oracle should do, if you ask me. Reporting bugs not related to customer confidential data or clear security issues to Oracle's internal bugs database only is a waste of additional time and efforts for all interested parties. This particular bug report is about statement/com/Query counter not incremented, but in this case I care more about the approach used (report bug in public) than the problem itself (even though it still shows that extra QA efforts are still needed for MySQL 5.6.x).
  • Bug #69895 - "mysql 5.6.13 i386 ships with 64bit libraries" on Solaris. Not that many people still care about Solaris these days, but for Oracle as a vendor of both it would make sense to care more about proper packaging on this platform. It's also a regression bug that again questions even basic QA testing of the releases...
  • Bug #69892 - "innodb stats interferes with innodb force recovery and drop/create tables". Shane is the most famous and productive bug reported over last 7 years, and he keeps up reporting bugs to public bug database. This one is serious enough, but I am more concerned about other recent bug report affecting MySQL 5.6.x that is still open: Bug #69907. It seems not only InnoDB statistics stored in tables, but also master and slave information stored in InnoDB tables may prevent any practical use of innodb_forced_recovery. This is pretty serious and now I wonder had anybody even tried to think about forced recovery while adding these new great features and more InnoDB tables to the "data dictionary"...
  • Bug #69887 - EXPLAIN for UPDATE of a single row by PRIMARY KEY shows access type as "range". This may be not even new and is just weird, but 5.6.13 is still affected as it was recently verified:

    mysql> explain select * from tbl_sample where id = 1\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: tbl_sample
             type: const
    possible_keys: PRIMARY
              key: PRIMARY
          key_len: 2
              ref: const
             rows: 1
            Extra: NULL
    1 row in set (0.00 sec)

    mysql> explain update tbl_sample set cnt = 1 where id = 1\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: tbl_sample
             type: range
    possible_keys: PRIMARY
              key: PRIMARY
          key_len: 2
              ref: const
             rows: 1
            Extra: Using where
    1 row in set (0.01 sec)
Does this small list mean that there no more bugs in MySQL 5.6.13, but the ones above? No, many old bugs are still not fixed, some bugs were verified by Oracle engineers on 5.6.13 many days before the official release. You may get information about some of them by following links in my previous post.

Wednesday, July 31, 2013

Fun with Bugs #20 - welcome MySQL 5.6.13!

MySQL 5.6.13 is released today! Installation is in progress right now, so I had not checked anything yet personally in reality, but we have release notes to study.

I'll base my quick review on my older posts devoted to known bugs in MySQL 5.6.12 in three main areas: InnoDB, optimizer and replication. All quoted text below is taken from the release notes.

Let's start with InnoDB. From my "top 10" list I only see the following bug fixed in 5.6.13:
  • Bug #69316. "Performance; InnoDB: A code regression introduced in MySQL 5.6 negatively impacted DROP TABLE and ALTER TABLE performance. This could cause a performance drop between MySQL Server 5.5.x and 5.6.x. (Bug #16864741, Bug #69316)"
Not much, really. But at least Bug #69623 is fixed (so you can use multi-file shared tablespace again) as expected and there are many other paragraphs in release notes speaking about InnoDB. Maybe it's only me having priorities not 100% corresponding to Oracle ones in this case.

Now, let's move to replication. From my "top 10" replication bugs in 5.6.12 I see the following fixed:
  • Bug #69369. "Replication: The condition leading to the issue fixed in Bug #16579083 continued to raise an error even though the condition itself no longer cause the issue to occur. (Bug #16931177, Bug #69369)."
  • Bug #69341 . "Replication: When rpl_semi_sync_master_timeout was set to an extremely large value, semi-synchronous replication became very slow, especially when many sessions were working in parallel. It was discovered that the code to calculate this timeout was inside the wait loop itself, with the result that an increase in the value of rpl_semi_sync_master_timeout caused repeated iterations. This fix improves the method used to calculate wakeup times, and moves it outside of the wait loop, so that it is executed one time only. (Bug #16878043, Bug #69341)"
  • Bug #69096 - this actually fixed a minor remaining problem of referencing non-existent session variable, GTID_NEXT_LIST. The real problem was already fixed in 5.6.12 (see Bug #69045). 
That's all. But 2 serious bugs I cared about (among many others replication related fixes, I see 10 or so) is not bad.

Finally, regression bugs in optimizer. The following are fixed in 5.6.13:
  • Bug #69471. "When selecting a union of an empty result set (created with WHERE 1=0 or WHERE FALSE) with a derived table, incorrect filtering was applied to the derived table. (Bug #69471, Bug #16961803)"
  • Bug #69410. "For queries with ORDER BY ... LIMIT, the optimizer could choose a nonordering index for table access. (Bug #69410, Bug #16916596)"
  • Bug #68897. "Some LEFT JOIN queries with GROUP BY could return incorrect results. (Bug #68897, Bug #16620047)".
3 out of 10 is good, but many optimizer regressions remain.

Installation finished, time for real life checks (and more posts based on results):

C:\Program Files\MySQL\MySQL Server 5.5\bin>mysql -uroot -proot -P3314 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 9
Server version: 5.6.13 MySQL Community Server (GPL)

Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>


MySQL 5.6.13 looks promising enough from the first sight,and I surely hope new MySQL Cluster 7.3 release based on it (instead of a really much worse 5.6.11) is coming soon.

Tuesday, July 30, 2013

What is the problem with "To be fixed later" bug status?

Bug #69842 was actively discussed on Facebook recently. Mostly not it's technical content - people do agree that InnoDB probably needs separate doublewrite buffer(s) for every possible InnoDB page size. It's more about bugs processing approaches, so I have to say something about this.

The story was simple enough. I've mentioned this bug in my previous post and yesterday my dear friend Sinisa added some comments and set status to "To be fixed later". This had made bug reporter (who co-incidentally is a "small data engineer" at Facebook) unhappy, because (let me quote):
"...bugs can still be closed as "to be fixed later", which means even it may affect other users (it affect us, at least), nobody will see that as an open issue MySQL has - especially that some of us would like to be fixed sooner rather than later."
This story had got a happy end actually. James Day stepped in and set status "Verified" for this report as a feature request. But question remains: was setting status to "To be fixed later" a proper action in this case, what does this status mean and does it lead to any problems?

Note in any case that "To be fixed later" status had been used in public bugs database for 8+ years at least (for me it was just always there), this is NOT a recent evil Oracle invention. Let me quote my old post explaining both this and "Won't fix" status (also mentioned in that recent discussion):
  • To be fixed later - it means that while the problem exists and verified, there is no way to fix it in current GA or development versions, or fix needs serious changes in software or data format, or serious development efforts and thus requires long term planning. Usually it means "To be fixed never".
  • Won't fix - developers just do not want to fix this. This is more a "feature" than a "bug", even if user thinks differently. Users often ask for some things that, while make sense for them and their use case, may be against SQL standard or have acceptable workarounds in frames of current implementation etc.
As for problems, Domas later claimed that "To be fixed later" bugs are not visible in searches later. I'd say it depends on how you search. Click here to get the list of all (234 at the moment of writing) bugs having "To be fixed later" status. The status is "terminating" though, and thus the bug in this status is not visible when you search for "Active" bugs. Just try to search for "truncate" among active bugs using bugs database search, and you will NOT find Bug #68184, while the problem of TRUNCATE being surprisingly slow for InnoDB table when buffer pool is big (while DROP is fixed long time ago) is known and important to solve. On the other hand, simple Google search for site:bugs.mysql.com truncate slow shows this bug to me on the first page of results. Note that by default bugs database does NOT include feature requests (S4 bugs) into the search, so current status of bug report from Facebook may be not much better for searches actually.

So, depending on the way you search, having bug with status "To be fixed later" may be a problem (but only if you are not experienced in MySQL bugs search). Any other problems? One is mentioned in my quote above: I made a statement that "To be fixed later" used to mean "To be fixed never". Is this a true statement or a bad joke? Both, to some extent.

Let me explain this. Ideally status "To be fixed later" should be set (for a bug report or feature request that was originally "Verified"!) by a development manager or key developer (like Marko in case of InnoDB) who checked all planned bug fixes for current GA versions, all planned bug fixes and new features for the version currently in development and, based on available resources and priorities, made a decision (as this is his job) to NOT include the fix in any version that is currently under development. Ideally, at the same time, worklog is to be created for this feature and made public, so that people can contribute with ideas, questions and express their need for this new feature (or bug fix).

So, can we even suspect anything as ideal as above in case of bug #69842? Obviously we can not. Even if we assume that Sinisa has the power of making decisions on further InnoDB development, hardy he spent enough time on Monday to make sure this does NOT fit into InnoDB plans for MySQL 5.7 (that is still at very early stage of development. He had not referenced any worklog created (and this is not so easy if at all possible in Oracle it seems to create new public worklogs). Surely bug reporter have some reasons to think that in this case "To be fixed later" may be equal to "To be fixed never"... I may be wrong here, but then I am sure somebody will explain this in public and give some insights on the huge amount of work entire MySQL organization in Oracle did before he sent his comments :)

Another potential problem with "To be fixed later" bugs, even if they had got this status with good reasons and after following the ideal procedure described above, is the following: who and when reviews bugs in these status again? They should be reviewed at least when development of new version starts, that is, recently when work started on 5.7 all old "To be fixed later" bugs had to be reviewed and some of them had to get some comments and status "Verified" back, if they are now considered for a new feature, or status "To be fixed later" re-established for 2 more years (assuming new GA release every 2 years based on recent Oracle habits and statements). Please, check the list of "To be fixed later" bugs and try to find any evidence of something like this happened any time since 2011.

I know for sure only one iteration like this happened, and I made it myself back in 2007 or so maybe, probably even before MySQL AB joined Sun (some time before 5.1 GA). I worked with Trudy (and maybe Peter G.), main MySQL architects of that times, and was able to re-open some of "To be fixed later" bugs, not many. I am not sure if anybody tried to repeat this again, especially recently in Oracle. Let me stay corrected...

Friday, July 26, 2013

Fun with Bugs #19 - waiting for MySQL 5.6.13 and some real fun?

I feel like MySQL 5.6.12 was released ages ago, while in reality it was on June 3, less than 2 months ago. No wonder I feel so, after writing several posts about bugs fixed and not fixed in it... Anyway, we still have to wait for MySQL 5.6.13 for a week or even two probably and in the meantime I decided to write new post for this series based on good old idea of making a digest of my recent bugs-related posts at Facebook. I know, it's boring and annoying (same as waiting for the release of 5.6.13).

Let's start with Bug #69846 - "ICP does not work on UNIQUE indexes". Based on my quick tests presented there I'd say that ICP (index condition pushdown) actually does not work for InnoDB clustered key (being it explicit PRIMARY KEY or just the first UNIQUE KEY), and this is probably by design, as such index is "equal" to InnoDB table itself. I'd still prefer to see this explained in the manual. Bug is still "Open", so Oracle engineers may have different opinion.

Another bug made my day actually, it's Bug #69842 from Domas. It does not matter that it is "Open" or that doublewrite buffer does not work efficiently with InnoDB page sizes smaller than 16K... Question is how to create proper Latin-based term for writing something 3, 5 ("quintuple-writing") or 17 times...

Optimizer in MySQL had always been a source of endless fun. It seems that for some cases it worked properly last time in 5.1 (if not 5.0). Check Bug #69833 for a recent example, "Bad interaction between MIN/MAX and "HAVING SUM(DISTINCT)": wrong results". Yes, it's a weird corner case with not much real life use, but still it's a regression bug. In some areas MySQL 5.1 was better than any 5.5 or 5.6 GA version from Oracle so far :)

Bug #62578 is still affecting customers who use MySQL 5.5.x. Recent checks show that 5.6.12 is NOT affected (neither Sinisa nor me were able to repeat it on 5.6.12), but mysql client of 5.5.32 still crashes when you reside terminal (or putty) window it runs in. Fix is well know and probably found its way to 5.6 long time ago, but why it is NOT in 5.5? Yes, I still remember talks about a policy to fix bugs in older GA versions only when customers ask and there is no risk or high efforts involved, but isn't this the case for this bug?

I've spent a lot of time trying to find similar known bug report for Bug #69825 reported by my colleague, but failed. It seems new problem, at least in public bugs database. Sounds simple, "InnoDB: Assertion failure in thread ... in file lock0wait.cc line 297", affects 5.6.12 and the only repeatable test case so far is based on customer confidential data (as far as I know), but still... Why it is still "Open" and got no comments? Had everybody in Oracle decided that GTID-based crash safe replication with information stored in InnoDB tables in mysql database just always work, so it must be some bug reporter fault? It's not the case, dear colleagues, it's not the case...

Now something simple and easy to verify by code review, Bug #69827. "Hardcoded libdir in cmake", so that lib64 on Linux is used only for x86_64. Who cares I've asked? But even if we do not take 64-bit PowerPC into account any more, some sources claim 64-bit ARM-based servers are coming soon... Should be easy fix actually (hacked in by many users already probably).

If you are going to upgrade to MySQL 5.6 and had forgotten that configuration file in basedir (if present) overrides settings in /etc/my.cnf, check Bug #68643 - "sql_mode is unkind in my.cnf created by mysql_install_db". Yes, as it silently includes STRICT_TRANS_TABLES. It's just a feature request for now, so hardly is going to change any time soon (if at all). Just take care about this, I warned you.

 Week started with funny bugs mostly. This one is open for more than 7 years and still scare people, Bug #18256. If session is killed you still may see messages like this in your mysql command line client:

ERROR 1053 (08S01) at line 1: Server shutdown in progress

Do not panic, please. Server may be perfectly alive, just your session no longer works.

On the weekend I still had fun with the manual and friends who are scared by Oracle requests for personal information. Check Bug #69805 I've reported (or http://falseisnotnull.wordpress.com/2013/07/21/mysql-not-found-errors-precedence-docs-bug/ for the whole story) and note that fine manual does not describe what was really implemented correctly. Let them now decide who was wrong, manual (I hope) or implementation of the feature as designed. Error handling in stored routines is still to complex and far from perfect even in MySQL 5.6 if you ask me, at least comparing to good old Oracle PL/SQL (this is what I really miss since 2005 sometimes...)

That's all for now. I've spent this week digging around and trying to fix things and protect strangers from dangers like this:


Hope this helps somebody.

Sunday, July 21, 2013

Fun with Bugs #18 - Feature Requests (Oldies but Goldies Part II)

In the previous "Fun with Bugs" posts I've mostly ignored feature requests. Users do file a lot of feature requests in MySQL bugs database, but until recently (when "Affects Me" button was introduced) there was no clear way to even try to influence the priority of the feature in development plans. There is still no way to see if the feature request has any priority. Surely, based on Oracle policies, nobody from Oracle will even try to give you a hint on when the "Verified" feature request is going to be implemented or what is its real internal priority...


Does it mean that there is no sense to make feature requests (or, for Oracle engineers, to process them and keep status in sync with reality)? No, it does not, IMHO. This is still one of few ways for a community user or even a customer to influence the future of MySQL. Even if Oracle will not start to act immediately in any obvious way based on your feature request, it will still be documented, reviewed by Oracle engineers, other community users and maybe even developers from other forks. Entire world will know what is missing (and accepted as missing by vendor, if feature request is "Verified"). Then, based on its status changes, entire world will know if anybody cares enough...




So, let me continue my previous post about oldest but still formally "Verified" bugs with a list of 12 oldest and still "Verified" feature requests:
  1. Bug #400 is the oldest and it remains "Verified" since good old days when Monty himself replied to MySQL feature requests. It's about backslash (\) usage as escape character before single and double quotes. I do not have any opinion about it, but still it's the oldest still "Verified" feature request.
  2. Bug #765 - "mysqlimport should read from stdin standard input". Maybe, it's a kind of a "Unix way" to have some option to do so...
  3. Bug #1118  - "Allow multiple concurrent locks with GET_LOCK()". This one from Dean Ellis I'd like to see implemented years ago. And it was implemented recently by former famous MySQL developer, Konstantin Osipov, who had made a BSD-licensed patch available here. Feature request is still just "Verified", bit this is how it may work: even if not Oracle, somebody can pick up useful feature request and implement it.
  4. Bug #1154 - "cannot rename temporary table". I really wonder why not to make RENAME working the same way as ALTER TABLE ... RENAME in this case. Nobody wants to touch old code and test cases maybe...
  5. Bug #1207 - this is a request for function to count the number of substring occurrences in a string, like PHP's substr_count(). Workarounds are more or less obvious, so this feature request may hang around not implemented for 10 more years easily...
  6. Bug #1214 - "Unique row identifier data type for MySQL (like MSSQL uniqueidentifier)". UUID() mostly works for this if MIXED/ROW-based replication is used, but still this concern expressed by James Day remains:

    "Decision needed for this bug is if UUID() should be our answer to a unique identifier feature request in 5.1 and later. It's very non-optimal for InnoDB primary key use because it puts the most rapidly changing parts of the time first."
    Check the bug report for different workarounds, but ultimate solution is not yet in place it seems.
  7. Bug #1275 - "Multiple uninstall options if upgrade a preexisting install in same directory". This is a Windows-specific feature request and I am not sure what's going on now with new installers. I had never really tried to upgrade in place in real life using .msi or new installer (if maybe to verify some bug report). It's surely better to install every version into its own directory and then play with datadir in my.ini to make upgraded MySQL work with data from previous version.
  8. Bug #1309 - "EXCEPT statement". One of the early requests to have all kind of relational operations requested by standard in MySQL. Of them we still have only UNION [ALL].
  9. Bug #1310 - "optionally add queries with lock times to slow_query log". I assumed this is already implemented, but I may be wrong...
  10. Bug #1327 - "I would like functions/SQL commands in the MySQL API/Language so that a client application can be notified when things happen in the database." Now I am not sure if we need to implement anything for this, but back in 2005 I thought there is something to consider here. Still "Verified" since that time.
  11. Bug #1341 - "InnoDB ibdata1 never shrinks after data is removed". Still the case. Setting innodb_file_per_table=1 and using separate tablespaces for undo in MySQL 5.6 partially solves the problem (by removing reasons for ibdata1 to grow much). But users still complain, for 10 years already...
  12. Bug #1343 - "index by date part of datetime field". I've suggested some kind of a workaround there back in 2005.
I think we had enough examples to understand that both reporting and reading feature requests is useful - you can get a workaround or even some kind soul providing a patch, even for the feature request that is still "Verified".