Sunday, January 22, 2017

oprofile Basics for MySQL Profiling

In my previous post I presented a list of bugs reported by famous MySQL developers, DBAs, users and support engineers who used oprofile to find the problem (or its root cause) they studied. I assume that some of my readers would like to follow in their footsteps and try this tool while working on MySQL performance problems. The bug reports I mentioned contain outputs and, rarely, just some of the commands used. That's why I decided to write a kind of basic "Howto" document. The tool does have a detailed enough manual, but a few basic steps for quickly applying oprofile to MySQL-related problems should still help. At least I have always considered this old post by Oli a useful reference for myself and have Googled for it more than once.

These days users of oprofile may need an updated quick reference or Howto guide, as starting from version 0.9.8 the tool began to change: new commands were introduced and, starting from version 1.0.0, familiar commands like opcontrol are not just deprecated but removed from the tool entirely (which means that pt-stalk's --collect-oprofile option simply does not work any more). For this post I decided to use version 0.9.9 on my good old Ubuntu 14.04 netbook. This should, theoretically, allow me to demonstrate both the "old" and the "new" way to use the tool.
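Since the command set differs so much between releases, any script meant to work across distributions may want to branch on the installed oprofile version first. Here is a minimal sketch; the helper name and the version-detection hint in the comment are mine, not part of oprofile:

```shell
# Pick the oprofile front-end for a given version string:
# opcontrol for the old daemon-based 0.x releases, operf for 1.0.0+.
choose_profiler() {
    case "$1" in
        0.*) echo opcontrol ;;
        *)   echo operf ;;
    esac
}

# On a real system the version could be taken, e.g., from:
#   ophelp --version
choose_profiler 0.9.9   # -> opcontrol
choose_profiler 1.1.0   # -> operf
```

A wrapper like this is also roughly what pt-stalk would need to keep its --collect-oprofile option working on both old and new systems.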

So, I am using the following system for the purposes of this blog post:
openxs@ao756:~/dbs/maria10.1$ uname -a
Linux ao756 3.13.0-100-generic #147-Ubuntu SMP Tue Oct 18 16:48:51 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
openxs@ao756:~/dbs/maria10.1$ cat /etc/issue
Ubuntu 14.04.5 LTS \n \l

openxs@ao756:~/dbs/maria10.1$ dpkg -l | grep -i oprofile
ii  oprofile                                              0.9.9-0ubuntu8                                      amd64        system-wide profiler for Linux systems
openxs@ao756:~/dbs/maria10.1$ bin/mysql -uroot -e"show variables like '%version%'"
+-------------------------+------------------+
| Variable_name           | Value            |
+-------------------------+------------------+
| innodb_version          | 5.6.31-77.0      |
| protocol_version        | 10               |
| slave_type_conversions  |                  |
| version                 | 10.1.18-MariaDB  |
| version_comment         | MariaDB Server   |
| version_compile_machine | x86_64           |
| version_compile_os      | Linux            |
| version_malloc_library  | bundled jemalloc |
| version_ssl_library     | YaSSL 2.3.9b     |
| wsrep_patch_version     | wsrep_25.16      |
+-------------------------+------------------+
openxs@ao756:~/dbs/maria10.1$ ps aux | grep mysqld
openxs   25568  0.0  0.0   4440   784 pts/10   S    19:00   0:00 /bin/sh bin/mysqld_safe --no-defaults
openxs   25631  1.1  2.4 751552 93596 pts/10   Sl   19:00   0:22 /home/openxs/dbs/maria10.1/bin/mysqld --no-defaults --basedir=/home/openxs/dbs/maria10.1 --datadir=/home/openxs/dbs/maria10.1/data --plugin-dir=/home/openxs/dbs/maria10.1/lib/plugin --log-error=/home/openxs/dbs/maria10.1/data/ao756.err --pid-file=ao756.pid
openxs   26158  0.0  0.0  14636   968 pts/10   S+   19:31   0:00 grep --color=auto mysqld
I assume you already have the oprofile package installed. On this box I use MariaDB 10.1.18 built from source (I work for MariaDB, after all).

Basic steps to use the tool for studying some problematic period of MySQL server activity are the following:
  1. We check the version used and, if we want to use the older version of the tool with its daemon-based approach, we initialize and start sample collection using the opcontrol tool:

    openxs@ao756:~/dbs/maria10.1$ opcontrol --version
    opcontrol: oprofile 0.9.9 compiled on Apr  4 2014 23:34:07
    openxs@ao756:~/dbs/maria10.1$ opcontrol --init
    Kernel support not available, missing opcontrol --init as root ?
    openxs@ao756:~/dbs/maria10.1$ sudo opcontrol --init
    [sudo] password for openxs:
    openxs@ao756:~/dbs/maria10.1$ sudo opcontrol --start --no-vmlinux
    ATTENTION: Use of opcontrol is discouraged.  Please see the man page for operf.
    Using default event: CPU_CLK_UNHALTED:100000:0:1:1
    Using 2.6+ OProfile kernel interface.
    Using log file /var/lib/oprofile/samples/oprofiled.log
    Daemon started.
    Profiler running.

    Note that superuser permissions are needed for most opcontrol commands. I've added the --no-vmlinux option as I do not want to profile the kernel itself.
  2. Now we let sample collection run for some time, until the period we want to study ends. For this post I use a primitive single-thread "benchmark" test from Bug #39630 that I once reported and worked on for a long time:

    openxs@ao756:~/dbs/maria10.1$ time bin/mysql -uroot -e'select benchmark(500000000, 2*2)'
    +---------------------------+
    | benchmark(500000000, 2*2) |
    +---------------------------+
    |                         0 |
    +---------------------------+

    real    0m20.209s
    user    0m0.009s
    sys     0m0.004s


    This was a deliberately very simple use case, chosen just to check that we get meaningful and expected output.
  3. When the period we want to study ends, we should stop collecting samples and dump the results:

    openxs@ao756:~/dbs/maria10.1$ sudo opcontrol --stop
    Stopping profiling.
    openxs@ao756:~/dbs/maria10.1$ sudo opcontrol --dump
    openxs@ao756:~/dbs/maria10.1$


    Theoretically we can have several sample-collecting "sessions" saved under unique names, etc., but for now I want to concentrate on the most basic usage.
  4. Now we use the opreport tool to check the results:

    openxs@ao756:~/dbs/maria10.1$ sudo opreport --demangle=smart --symbols --merge tgid `pwd`/bin/mysqld | head -12
    Using /var/lib/oprofile/samples/ for samples directory.
    warning: /no-vmlinux could not be found.
    warning: [vdso] (tgid:25631 range:0x7ffc6fac3000-0x7ffc6fac5000) could not be found.
    CPU: Intel Sandy Bridge microarchitecture, speed 1.5e+06 MHz (estimated)
    Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
    samples  %        image name               symbol name
    83358    28.0632  mysqld                   Item_func_mul::int_op()
    58838    19.8083  mysqld                   Item_func_benchmark::val_int()
    55420    18.6576  mysqld                   Item_hybrid_func::result_type() const
    52001    17.5066  mysqld                   Item_func_hybrid_field_type::val_int()
    25931     8.7299  mysqld                   Item_int::val_int()
    10861     3.6564  mysqld                   Type_handler_int_result::cmp_type() const
    10104     3.4016  mysqld                   Type_handler_int_result::result_type() const
    460       0.1549  no-vmlinux               /no-vmlinux
    12        0.0040  mysqld                   sync_array_print_long_waits(unsigned long*, void const**)
    openxs@ao756:~/dbs/maria10.1$


    We provided the --symbols option to list per-symbol information instead of a binary image summary, --demangle=smart to get readable names of C++ functions, and --merge tgid to merge profiles by thread group ID (for all threads of the same multi-threaded process, like mysqld). We also had to provide a full path to the mysqld binary to restrict the symbol summary to that specific executable image, the mysqld process of MariaDB 10.1.18. When mysqld from PATH or from some standard package is used, $(which mysqld) can be used instead on the command line.
Basically that's all: we clearly see that most of the time was spent on integer multiplication (as one would expect). We may want to use the --deinit, --reset or --shutdown options of opcontrol if we do not plan any further profiling.
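To recap the daemon-based workflow end to end, including the cleanup just mentioned, the whole cycle can be scripted as below. This is a sketch: by default the commands are only echoed, so it is safe to run as-is; set RUN=sudo on a real system with oprofile 0.9.x installed to actually execute them.

```shell
RUN="${RUN:-echo}"                    # dry run by default; RUN=sudo to execute

$RUN opcontrol --init                 # load the oprofile kernel module
$RUN opcontrol --start --no-vmlinux   # start sampling, skip kernel profiling
# ... run the workload to study here ...
$RUN opcontrol --stop                 # stop collecting samples
$RUN opcontrol --dump                 # flush collected samples to disk
$RUN opreport --demangle=smart --symbols --merge tgid "$(which mysqld)"
$RUN opcontrol --save=run1            # optional: archive samples under a name
$RUN opcontrol --reset                # or: discard the collected samples
$RUN opcontrol --shutdown             # stop the oprofiled daemon
$RUN opcontrol --deinit               # unload the kernel module
```

The --save option is the 0.9.x way to keep several named sample sets around; opreport can then select one of them with the session:name syntax.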

Now, what if we use oprofile version 1.0.0 or newer (as happened to me on Fedora 25 and Ubuntu 16.04)? The opcontrol tool just does not exist there, nor does the oprofiled daemon that I see running now:

openxs@ao756:~/dbs/maria10.1$ ps aux | grep oprofiled
root     27281  0.0  0.2  14036  7852 ?        S    20:22   0:00 /usr/bin/oprofiled --session-dir=/var/lib/oprofile --separate-lib=1 --separate-kernel=1 --separate-thread=1 --separate-cpu=0 --events=CPU_CLK_UNHALTED:60:0:100000:0:1:1, --no-vmlinux
openxs   27446  0.0  0.0  14636   972 pts/10   S+   20:41   0:00 grep --color=auto oprofiled
openxs@ao756:~/dbs/maria10.1$

The basic steps are the following, based on the operf tool (surely with no oprofiled daemon running):

openxs@ao756:~/dbs/maria10.1$ operf -p `pidof mysqld` &
[2] 4359
openxs@ao756:~/dbs/maria10.1$ operf: Press Ctl-c or 'kill -SIGINT 4359' to stop profiling
operf: Profiler started

openxs@ao756:~/dbs/maria10.1$ time bin/mysql -uroot -e'select benchmark(500000000, 2*2)'
+---------------------------+
| benchmark(500000000, 2*2) |
+---------------------------+
|                         0 |
+---------------------------+

real    0m19.772s
user    0m0.006s
sys     0m0.005s
openxs@ao756:~/dbs/maria10.1$ kill -SIGINT 4359
openxs@ao756:~/dbs/maria10.1$
Profiling done.

[2]+  Done                    operf -p `pidof mysqld`
openxs@ao756:~/dbs/maria10.1$ opreport --demangle=smart --symbols --merge tgid /home/openxs/dbs/maria10.1/bin/mysqld
Using /home/openxs/dbs/maria10.1/oprofile_data/samples/ for samples directory.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:4270 range:0x7ffebbf38000-0x7ffebbf39fff) could not be found.
CPU: Intel Sandy Bridge microarchitecture, speed 1.5e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name               symbol name
132      70.5882  no-vmlinux               /no-vmlinux
13        6.9519  libc-2.19.so             memset
10        5.3476  mysqld                   os_aio_linux_handle(unsigned long, fil_node_t**, void**, unsigned long*, unsigned long*)
9         4.8128  mysqld                   sync_array_print_long_waits(unsigned long*, void const**)
3         1.6043  libpthread-2.19.so       __pthread_mutex_cond_lock
2         1.0695  libpthread-2.19.so       pthread_self
2         1.0695  mysqld                   btr_defragment_thread
2         1.0695  mysqld                   srv_error_monitor_thread
1         0.5348  [vdso] (tgid:4270 range:0x7ffebbf38000-0x7ffebbf39fff) [vdso] (tgid:4270 range:0x7ffebbf38000-0x7ffebbf39fff)
1         0.5348  libaio.so.1.0.1          /lib/x86_64-linux-gnu/libaio.so.1.0.1
1         0.5348  libc-2.19.so             __libc_disable_asynccancel
1         0.5348  libc-2.19.so             __select_nocancel
1         0.5348  libc-2.19.so             fflush
1         0.5348  libpthread-2.19.so       pthread_cond_timedwait@@GLIBC_2.3.2
1         0.5348  mysqld                   buf_flush_page_cleaner_thread
1         0.5348  mysqld                   buf_pool_get_oldest_modification()
1         0.5348  mysqld                   log_checkpoint(unsigned long, unsigned long, unsigned long)
1         0.5348  mysqld                   mutex_enter_nowait_func(ib_mutex_t*, char const*, unsigned long)
1         0.5348  mysqld                   srv_check_activity(unsigned long, unsigned long)
1         0.5348  mysqld                   srv_master_evict_from_table_cache(unsigned long)
1         0.5348  mysqld                   srv_master_thread
1         0.5348  mysqld                   ut_delay(unsigned long)
openxs@ao756:~/dbs/maria10.1$


Note that, theoretically, operf allows profiling separate commands or the user's own processes (as I did with the -p `pidof mysqld` option) as a non-privileged user; I did not use sudo above. Also, we need just one command now, and the running operf is stopped with a SIGINT signal when we are done profiling.
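The other mode mentioned above, profiling a single command rather than attaching to an existing PID, needs no signal handling at all: operf runs the command, stops sampling when it exits, and writes samples to ./oprofile_data in the current directory. A sketch (the helper name is mine, and it bails out politely when operf is not installed):

```shell
# Profile one command with operf as an unprivileged user, then show the
# top symbols from the freshly written ./oprofile_data samples.
profile_cmd() {
    if ! command -v operf >/dev/null 2>&1; then
        echo "operf not installed" >&2
        return 1
    fi
    operf "$@" || return 1
    opreport --demangle=smart --symbols 2>/dev/null | head -20
}

# Example (assumes a binary to profile): profile_cmd ./my_benchmark --arg
```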

The results I've got from opreport in this case, though, do not look 100% trustworthy to me. Not because of the warning from [vdso] (we had a similar one, although for other addresses, with the classic approach as well, and you can find some related information in the FAQ), but because I see a totally different profile than before under basically the same load. I see activity by background functions, but nothing related to the multiplication of integers or the benchmark() function. This is suspicious and, until further tests on Fedora 25, I'd prefer NOT to trust operf from oprofile version 0.9.9 on Ubuntu 14.04...

Next I plan to check how operf from the latest oprofile version works on Fedora 25 and (maybe) Ubuntu 16.04, apply the profiler to more interesting use cases, show more options (including --callgraph), and try to compare recent oprofile to the perf profiler (which I currently trust more and find somewhat easier to use for MySQL-related tasks). Stay tuned!

Saturday, January 21, 2017

Fun with Bugs #47 - On Some Bugs Found Using oprofile

Users had to identify the reasons for MySQL performance problems somehow well before famous Performance Schema appeared in MySQL 5.5.3, and even before Domas invented his Poor Man's Profiler. Poor users had to use some OS-level tools, and among these tools the most important one was, probably, oprofile.

oprofile is a system-wide statistical profiling tool for Linux, available since 2001 and the 2.4.x kernels. It was applied to MySQL for many years and allowed users to find and explain many performance problems. We even have the --collect-oprofile option for the pt-stalk tool, and it's common knowledge that Percona Toolkit includes tools and options that have proved useful for real-life problems more than once... So we can safely assume that oprofile is (or at least was) an essential tool for MySQL performance problem solving.

For this post I decided to check and pick some MySQL bugs identified or verified with a little help from oprofile. Here is a list of the most important ones (from my point of view), starting from the oldest:
  • Bug #29921 - "MYSQLparse uses 3X more CPU in MySQL 5.0 than yyparse in MySQL 4.0". It was reported by Mark Callaghan and eventually fixed (to a large extent) in versions 5.0.50 and 5.1.23. While gprof was used by developers for further analysis, the reason for the performance regression was demonstrated and proved with oprofile.
  • Bug #31551 - "MyISAM poor performance with multiple queries running concurrently". Reported by Vadim Tkachenko back in 2007, this bug (essentially about a key cache mutex contention problem, demonstrated using oprofile) is still "Verified". Nobody seems to care to fix it, and as a workaround one may have to use multiple key caches in a concurrent environment.
  • Bug #33948 - "performance issue when using sysbench benchmark on a multiple-core system". It was reported by Terry Tao, who found it using oprofile:
    "Used oprofile to trace it then found it spended more time on searching a free table entry from open_cache. "
    and suggested a patch, but the real solution based on that patch was implemented as part of the work on several worklogs in 5.4 (and 5.5.x) related to LOCK_open and metadata locks.
  • Bug #43161 - "buf_calc_page_new_checksum and ut_2_power_up use a lot of CPU on busy IO loads". It was reported by Mark Callaghan and later claimed a duplicate of his Bug #54814. Sveta Smirnova verified it using oprofile. The fix was contributed by Laurynas Biveinis for 5.6 and appeared only in MySQL 5.7.4.
  • Bug #44723 - "Larger read_buffer_size values can cause performance decrease for INSERTs", was reported by Harrison Fisk, who used oprofile on Linux to demonstrate the problem. The bug is fixed since MySQL 5.0.86, 5.1.39, and 5.4.4.
  • Bug #49047 - "InnoDB deadlock detection is CPU intensive with many locks on a single row". It was reported by Harrison Fisk and was partially fixed, based on the patch he suggested, in the InnoDB Plugin for MySQL 5.1 (available after 5.1.50 or so) and in MySQL 5.5. But note that the innodb_deadlock_detect server variable was added only recently (in MySQL 5.7.15!) to disable deadlock detection entirely. oprofile (along with DTrace) was used to identify the bug.
  • Bug #49169 - "read_view_open_now is inefficient with many concurrent sessions", was reported by Mark Callaghan. oprofile helped to find out that a lot of time was spent in this function. The bug is closed as "It was fixed as part of the kernel mutex split in 5.6.", but check the last comments for some discussion of what was and was not fixed.
  • Bug #49177 - "MyISAM share list scalability problems". This bug was reported by Sergei Golubchik. Alexey Stroganov later posted a detailed analysis based on oprofile outputs, among other things. The bug was completely fixed only recently, in MySQL 5.7.9+ and MySQL 8.0.
  • Bug #53825 - "Removing locks from queue is very CPU intensive with many locks", was reported by Harrison Fisk. It is still "Verified" and thus, probably, not fixed. Note that it may become a serious bottleneck in 5.7.15+ with deadlock detection disabled. I plan to double check this assumption with profilers soon.
  • Bug #56332 - "Performance regression in DROP TABLE performance post 5.0.77", was reported by Justin Swanhart. The problem is definitely fixed in recent 5.5.x and 5.6.x versions (but probably not in the older ones; there were more regressions later in the process). This bug report contains some useful examples of opreport commands.
  • Bug #58037 - "LOCK_open contention from the table cache". This bug was reported by Mark Callaghan. The problem was well known in the past. Alexey Stroganov proved (with oprofile and scalability testing) that the problem was resolved with introduction of metadata locks in 5.5.x.
  • Bug #61401 - "UPDATE performance degrades gradually if a trigger exists", was reported by Aurimas Mikalauskas. He used oprofile to prove that the initial implementation of MDL led to a clear performance regression compared to 5.1.x.
  • Bug #64304 - "assemble to implement procedure of transforming between big-end and little-end", was reported by Xiaobin Lin, who suggested a patch and proved with oprofile that it helps. The bug is still "Verified". Probably I have to talk to Marko Mäkelä (who is now my colleague in MariaDB!) about it soon...
  • Bug #68818 - "Large Value List in WHERE Clause Spends Too Much Time in 'statistics' State", was reported by Jervin Real. He used oprofile to prove his point and show where exactly in the code (as opposed to just "statistics" stage) the time was spent. Eventually this was declared a duplicate of Bug #68046 that is fixed since 5.6.11.
  • Bug #72115 - "Very slow create/alter table with partitions and STATS_PERSISTENT=1", was reported by my colleague Jan Lindstrom in 2014. It seems that even with other tools already around, including SHOW PROFILE, Performance Schema and perf profiler, oprofile is still sometimes used to show where the time is spent. After several problems and regressions identified, it seems the problem is mostly resolved in 5.6.17+. I plan to double check this some day.
We clearly see that famous MySQL developers, users, DBAs and support engineers have used oprofile successfully while working on performance-related problems. In one of the upcoming posts (while preparing for my FOSDEM 2017 talk) I plan to explain how to use it, both the older versions and the 1.x.y ones, which rely on the same kernel features as the perf profiler and changed the available approaches/commands a lot.

Saturday, December 31, 2016

Fun with Bugs #46 - On Some Bugs I've Reported During the Year of 2016

It's time to summarize the year of 2016. As a kind of weird summary, in this post I'd like to share a list of MySQL bug reports I created in 2016 that still remain "Verified" today:

  • Bug #79831 - "Unexpected error message on crash-safe slave with max_relay_log_size set". According to Umesh this is not repeatable with 5.7. The fact that I've reported the bug on January 4 probably means I was working at that time. I should not repeat this mistake again next year.
  • Bug #80067 - "Index on BIT column is NOT used when column name only is used in WHERE clause". People say the same problem happens with INT and, what may be even less expected, BOOLEAN columns.
  • Bug #80424 - "EXPLAIN output depends on binlog_format setting". Who could expect that?
  • Bug #80619 - "Allow slave to filter replication events based on GTID". In this feature request I've suggested to implement filtering by GTID pattern, so that we can skip all events originating from specific master on some slave in a complex replication chain.
  • Bug #82127 - "Deadlock with 3 concurrent DELETEs by UNIQUE key". It's clear that the manual is not even close to explaining how the locks are really set "by design" in this weird case. See comments in MDEV-10962 for some explanations. Nobody from Oracle even tried to really explain how things are designed to work.
  • Bug #82212 - "mysqlbinlog can produce events larger than max_allowed_packet for mysql". This happens for encoded row-based events. There should be some way to take this overhead into account while creating binary log, IMHO.
  • Bug #83024 - "Internals manual does not explain COM_SLEEP in details". One day you'll see Sleep for some 17 seconds logged into the slow query log, and may start to wonder why...
  • Bug #83248 - "Partition pruning is not working with LEFT JOIN". You may find some interesting related ideas in MDEV-10946.
  • Bug #83640 - "Locks set by DELETE statement on already deleted record". This case shows that the design of locking in InnoDB does produce really weird outcomes sometimes. This is not about a "missing manual"; this is about an extra lock that is absolutely NOT needed (a gap X lock on a record in the secondary unique index is set when the same transaction already has the next-key lock on it). As a side note, I have kept finding, explaining and reporting weird or undocumented details of InnoDB locking for years; still, my talk about InnoDB locks was once again not accepted by Oracle for OOW in 2016. What do I know about the subject, and who even cares about those locks...
  • Bug #83708 - "uint expression is used for the value that is passed as my_off_t for DDL log". I was really shocked by this finding. I assumed that all improper uint vs unsigned long long casts had already been found. It seems I was mistaken.
  • Bug #83912 - "Time spent sleeping before entering InnoDB is not measured/reported separately". The use case that led me to reporting this bug is way more interesting than the fact that some wait is not instrumented in performance_schema. You may see more related bug reports from me next year.
  • Bug #83950 - "LOAD DATA INFILE fails with an escape character followed by a multi-byte one". This single bug (and related bugs and stories) was the original topic for issue #46 of my "Fun With Bugs" series. I was not able to write everything I wanted properly over the last 3 weeks, but trust me: it's a great story, in the "Let's Make America Great Again" style. With the goal for LOAD DATA to behave exactly as INSERT when wrong utf8 data are inserted, Oracle changed the way LOAD DATA works back and forth, with the last change (back) happening in 5.7.17:
     "Incompatible Change: A change made in MySQL 5.7.8 for handling of multibyte character sets by LOAD DATA was reverted due to the replication incompatibility (Bug #24487120, Bug #82641)"
    I just can not keep up with all the related fun people have in replication environments thanks to these ongoing changes... It's incredible.
  • Bug #84004 - "Manual misses details on MDL locks set and released for online ALTER TABLE". Nothing new: locks in MySQL are not properly/completely documented, metadata locks included. Yes, they are documented better now, after 11+ years of my continuous efforts (of a kind), but we are "not there yet". I am still waiting for a job offer to join the MySQL Documentation Team, by the way :)
  • Bug #84173 - "mysqld_safe --no-defaults & silently does NOT work any more". The recent MySQL 5.7.17 release not only gave us the new Group Replication plugin and introduced incompatible changes. In a hope to fix security issues, it also comes with a pure regression: for the first time in the last 11 years, mysqld_safe --no-defaults stopped working for me! By the way, mysqld_safe is still NOT safe in the sense that 5.7.17 tried to enforce, and one day (really soon) you will find out why.
  • Bug #84185 - "Not all "Statements writing to a table with an auto-increment..." are unsafe". If you do something like DELETE FROM `table` WHERE some_col IN (SELECT some_id FROM `other_table`), where `table` has an auto_increment column, why should anyone care about it? We do not generate the value, we delete rows...
    This bug report was actually created by Hartmut Holzgraefe, and the test case comes from Elena Stepanova (see MDEV-10170). I want to take this opportunity to thank them and other colleagues from MariaDB for their hard work and cooperation during the year of 2016. Thanks to Umesh (who processed most of my bug reports), Sinisa Milivojevic and Miguel Solorzano for their verifications of my bug reports this year.

In conclusion I should say that, no matter how pointless you may consider this activity, I still suggest you report each and every problem that you have with MySQL and cannot understand after reading the manual as a public MySQL bug. Now, re-read my 4-year-old post on this topic and have a Happy and Fruitful New Year 2017!

Saturday, December 3, 2016

MySQL Support Engineer's Chronicles, Issue #4

This week I had to deal with some unusual problems. But let me start with Percona's xtrabackup, software that I consider a key component of many current production MySQL setups and use regularly. Recently new minor versions of XtraBackup were released; check the details on 2.4.5, for example. It made a step towards supporting MariaDB 10.2, but there is still a long way to go, see pull request #200.

My main problem with xtrabackup, though, is not the lack of support for MariaDB 10.2-specific features. Why should they care, after all... The problem is that old, well-known bugs and problems are not resolved, those that may affect all MySQL versions, forks and environments. Check lp:1272329, "innobackupex dies if a directory is not readable. i.e.: lost+found", for example. Why not read and take into account the ignore_db_dir option (as the server does) and let those poor souls who use a mount point as a datadir make backups? Check an even older problem, passwords that are not hidden in command lines, see lp:907280, "innobackupex script shows the password in the ps output, when its passed as a command line argument". My colleague Hartmut even suggested a fix recently, see pull request #289.

Because these old, known problems (some of them low-hanging fruit) remain unfixed, users suffer while using xtrabackup way more often than they would like to. One day, as a result, they'll have to switch to other online backup tools or approaches. One may dream about myrocks_hotbackup being extended to cover InnoDB one day (when MyRocks and InnoDB work together in one instance), or just use Percona TokuBackup (after adding a script to do SST for Galera with it, maybe), or try something totally different. Anyway, I feel that if more bugs (including low-hanging fruit) in xtrabackup do not get fixed and pull requests are not actively accepted, the tool may become much less relevant and less used soon.

I had to deal with MaxScale-related issues this week, so I'd like to remind those who use Docker for testing about https://github.com/asosso/maxscale-docker/blob/master/Dockerfile. Personally, I prefer to build from source. In any case, I'd like us all to remember that in older versions one may have to set the strip_db_esc option explicitly for a service to deal with database names containing an underscore (_). Recent 2.0.x versions have it enabled by default (see MXS-801).
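For reference, this is roughly what the workaround looks like in /etc/maxscale.cnf on pre-2.0 versions. The service name and backend details below are made-up illustrations; only the strip_db_esc line is the point:

```ini
[Read-Write-Service]
type=service
router=readwritesplit
servers=server1
user=maxuser
passwd=maxpwd
# Make grants on databases with underscores (e.g. my_db) match:
strip_db_esc=true
```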

I also had to explain how online ALTER TABLE works, specifically, when it sets exclusive metadata locks in the process. I still do not see this topic properly explained in the manual, so I had to report Bug #84004, "Manual misses details on MDL locks set and released for online ALTER TABLE".


I have not been a developer for 11 years already, and one should expect Java code from me even less. Anyway, I had to explain how to replace Oracle's ref cursors (a.k.a. cursor variables) in MySQL, both in stored procedures and in the Java code that calls them. If you are wondering what this is about, check this fine manual. Note that this feature is missing in MySQL, even though it was suggested to implement it here. In general, MySQL just allows you to run SELECTs in stored procedures, and then in Java you can process each of the returned result sets any way you want. Things get more complicated when more than one result set is produced, and they are even more complicated in Oracle with nested cursor expressions. So, I plan to devote a separate blog post to this topic one day. Stay tuned.

I answer questions coming not only from customers. Old friends, community users out of nowhere and, even more so, colleagues are welcome to discuss whatever MySQL- or MariaDB-related problem they may have. If I know how to help, I will; otherwise I'll quickly explain that I am of no use. This is how I ended up quickly testing MariaDB's CONNECT storage engine as a replacement for the Federated engine, that is, to link a table to a remote MySQL table. Basic instructions on how to set it up and use the MySQL table type looked simple, but when I tried to test on Fedora 23 I hit the problem of a missing libodbc.so.1:

MariaDB [(none)]> INSTALL SONAME 'ha_connect';
ERROR 1126 (HY000): Can't open shared library '/home/openxs/dbs/mariadb10.1/lib/plugin/ha_connect.so'
  (errno: 2, libodbc.so.1: cannot open shared object file: No such file or directory)
the solution was not really straightforward. First of all, I had to install the unixODBC.x86_64 2.3.4-1.fc23 RPM, but it also does not provide libodbc.so.1:

[openxs@fc23 node2]$ find / -name libodbc.* 2>/dev/null
/usr/lib64/libodbc.so.2
/usr/lib64/libodbc.so
/usr/lib64/libodbc.so.2.0.0
So, I had to apply a quick and dirty hack:

[openxs@fc23 node2]$ sudo ln -s /usr/lib64/libodbc.so.2.0.0  /usr/lib64/libodbc.so.1
As a result, the CONNECT engine worked as expected, as long as a proper account and IP address were used:
MariaDB [test]> INSTALL SONAME 'ha_connect';
Query OK, 0 rows affected (0.27 sec)


MariaDB [test]> create table r_t2(id int primary key, c1 int) engine=connect table_type=mysql connection='mysql://msandbox:msandbox@127.0.0.1:23532/test/t';
Query OK, 0 rows affected (0.04 sec)

MariaDB [test]> select * from r_t2;                     
+----+------+
| id | c1   |
+----+------+
|  1 |    2 |
|  2 |    3 |
+----+------+
2 rows in set (0.00 sec)
From configuring MaxScale to work with a database having an underscore in the name, to rewriting for MySQL the Java code that used to work with Oracle RDBMS, with many missing details in the manuals and software bugs identified or reported in between, and all that with ongoing studies of performance problems and lost quorums, rolling upgrades and failed SSTs in Galera clusters: this is what support engineers here in MariaDB deal with during a typical working week.

Sunday, November 27, 2016

Upstart Basics for Automatic Restarts of MaxScale After Crash

Historically I have not cared much about MariaDB's MaxScale, at least since I learned how to build it from source when needed. But as a support engineer who works at MariaDB, sometimes I have to deal with problems related to MaxScale, and this week it happened that I had to explain what to do to implement automatic restarts of the MaxScale "daemon" after crashes on RHEL 6.x.

In the process I found out that two of my most often used Linux distributions, CentOS 6.x and Ubuntu 14.04, actually use Upstart, so good old System V init tricks and scripts work there only partially, and only because somebody cared to integrate them into this "new" approach to starting tasks and services during boot, stopping them during shutdown and supervising them while the system is running. I should have studied this years ago, but a customer's question finally forced me to check some details of how this system actually works.

So, unfortunately, there is no script like mysqld_safe to start and restart MaxScale after installing the official RPM from MariaDB (in this case it was maxscale-2.0.1-2.x86_64). My first idea was to write one, but then I asked myself why it was not there yet, decided to check what's already provided, and took a closer look at the /etc/init.d/maxscale I have on my CentOS 6.8 VM (the closest to what the customer used). It's a typical classic shell script for starting a service, and it starts MaxScale like this:

start() {
    echo -n $"Starting MaxScale: "
...
    ulimit -HSn 65535
    daemon --pidfile $MAXSCALE_PIDFILE /usr/bin/maxscale --user=maxscale $MAXSCALE_OPTIONS >& /dev/null

    RETVAL=$?
    [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$servicename
...
    # Return right code
    if [ $RETVAL -ne 0 ]; then
       failure
       RETVAL=$_RETVAL_NOT_RUNNING
    fi

    echo

    return $RETVAL
}
Basically, it runs /usr/bin/maxscale --user=maxscale; the rest are details (like the location of the PID file that the rest of the script then relies on). There is no advanced script to monitor the status of the process with this PID (like mysqld_safe does for mysqld) or anything to take care of automatic restarts. It is assumed that I just execute chkconfig maxscale on and then the service starts when the system enters the proper runlevel.

A simple test shows that when the maxscale process is killed, it's gone and is not restarted. In the good old times I'd just add something like this:

mx:2345:respawn:/usr/bin/maxscale --user=maxscale

to /etc/inittab (as nice articles like this one suggest; the fields are id:runlevels:action:process, so the line above respawns the process in runlevels 2 through 5), but a quick check and some further reading proved that this is not going to work on CentOS 6.8, as it uses Upstart.

So, either I had to write something similar to mysqld_safe for MaxScale, or (keeping in mind the size of that script and the number of bugs we've had in it in the past) I had better find out the intended way to respawn processes in Upstart. The basic ideas are, again, simple. One has to create an /etc/init/service_name.conf file and put something like this there (a real quote from one of the Ubuntu files for MySQL):
start on runlevel [2345]
stop on starting rc RUNLEVEL=[016]

respawn
respawn limit 2 5
...
pre-start script
...
end script

exec /usr/sbin/mysqld

post-start script
...
end script

The file is easy to understand even without reading the manual. One sets when the service starts and stops, adds a respawn stanza to restart it after unexpected crashes or kills of the process, optionally limits restarts with respawn limit COUNT INTERVAL (the job is given up on if it respawns more than COUNT times within INTERVAL seconds), and, optionally, does something before and after the start.

I quickly created /etc/init/maxscale.conf based on the above, and it did a great job of starting MaxScale automatically upon system startup. I basically just used exec /usr/bin/maxscale --user=maxscale and decided to deal with options and other details later if needed. But what was strange from the very beginning is that /var/log/messages showed what looked like repeated, failing attempts to start the maxscale process:

Nov 26 17:44:45 centos maxscale[20229]: MaxScale started with 1 server threads.
Nov 26 17:44:45 centos init: maxscale main process ended, respawning
Nov 26 17:44:45 centos maxscale[20229]: Started MaxScale log flusher.
Nov 26 17:44:45 centos maxscale[20235]: Working directory: /var/log/maxscale
Nov 26 17:44:45 centos maxscale[20235]: MariaDB MaxScale 2.0.1 started
Nov 26 17:44:45 centos maxscale[20235]: MaxScale is running in process 20235
...
Nov 26 17:44:45 centos maxscale[20235]: Loaded module qc_sqlite: V1.0.0 from /usr/lib64/maxscale/libqc_sqlite.so
Nov 26 17:44:45 centos maxscale[20235]: MaxScale is already running. Process id: 20229. Use another location for the PID file to run multiple instances of MaxScale on the same machine.
Nov 26 17:44:45 centos init: maxscale main process (20234) terminated with status 4
Nov 26 17:44:45 centos init: maxscale main process ended, respawning
Moreover, when the proper maxscale process was killed, it was NOT respawned as expected.

It was the proper time to read the manual more carefully, specifically the part about the expect stanza (which I had noted in some official Upstart scripts):

"To allow Upstart to determine the final process ID for a job, it needs to know how many times that process will call fork(2). Upstart itself cannot know the answer to this question since once a daemon is running, it could then fork a number of "worker" processes which could themselves fork any number of times. Upstart cannot be expected to know which PID is the "master" in this case, considering it does not know if worker processes will be created at all, let alone how many times, or how many times the process will fork initially. As such, it is necessary to tell Upstart which PID is the "master" or parent PID. This is achieved using the expect stanza.
The syntax is simple, but you do need to know how many times your service forks."
Let's quickly check how many times fork() is called in maxscale (I'd know this if I had ever cared to study the source code in detail, but I have not checked most of it yet). A test based on that cookbook gives unexpected results:

[root@centos ~]# strace -o /tmp/strace.log -fFv maxscale &
[1] 2369
[root@centos ~]# sleep 10
[root@centos ~]# ps aux | grep strace
root      2369  2.6  0.0   4476   888 pts/0    S    20:28   0:00 strace -o /tmp/strace.log -fFv maxscale
root      2382  0.0  0.0 103312   868 pts/0    S+   20:28   0:00 grep strace
[root@centos ~]# pkill -9 strace
[1]+  Killed                  strace -o /tmp/strace.log -fFv maxscale
[root@centos ~]# ps aux | grep maxscale
root      2375  1.3  0.2 276168  3896 ?        Ssl  20:28   0:00 maxscale
root      2385  0.0  0.0 103312   868 pts/0    S+   20:28   0:00 grep maxscale
[root@centos ~]# egrep "\<(fork|clone)\>\(" /tmp/strace.log | wc | awk '{print $1}'
5
How come we have 5 fork calls? Here they are:
[root@centos ~]# egrep "\<(fork|clone)\>\(" /tmp/strace.log
2374  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb024c08ab0) = 2375
2375  clone(child_stack=0x7fb01f819f10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb01f81a9d0, tls=0x7fb01f81a700, child_tidptr=0x7fb01f81a9d0) = 2376
2375  clone(child_stack=0x7fb01e118f10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb01e1199d0, tls=0x7fb01e119700, child_tidptr=0x7fb01e1199d0) = 2377
2375  clone(child_stack=0x7fb01d10af10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb01d10b9d0, tls=0x7fb01d10b700, child_tidptr=0x7fb01d10b9d0) = 2378
2375  clone(child_stack=0x7fb017ffef10, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7fb017fff9d0, tls=0x7fb017fff700, child_tidptr=0x7fb017fff9d0) = 2379
It seems process with PID 2374 actually forked just once, to produce PID 2375, and then several threads were started by that process. We can see them:
[root@centos ~]# ps -T -p `pidof maxscale`
  PID  SPID TTY          TIME CMD
 2375  2375 ?        00:00:00 maxscale
 2375  2376 ?        00:00:00 maxscale
 2375  2377 ?        00:00:00 maxscale
 2375  2378 ?        00:00:00 maxscale
 2375  2379 ?        00:00:00 maxscale
[root@centos ~]#
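The same conclusion can be reached mechanically: clone() calls that carry the CLONE_THREAD flag create threads inside the existing process, while fork()/clone() calls without it create new processes, and only the latter matter for Upstart's expect stanza. A quick sketch, assuming the /tmp/strace.log produced above:

```shell
# Count only the process-creating fork()/clone() calls in the strace log.
# clone() with CLONE_THREAD starts a thread, not a new process, so such
# calls are filtered out; wc -l avoids grep's non-zero exit status when
# nothing matches. The log path is the one used in the strace run above.
log=/tmp/strace.log
grep -E '\<(fork|clone)\>\(' "$log" | grep -v CLONE_THREAD | wc -l
```

For the log quoted above this yields 1, confirming that expect fork (exactly one fork) rather than expect daemon (two forks) is the right stanza here.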
So, it was really one fork() (and I could have noticed that even from studying /var/log/messages earlier), and I have to add the expect fork stanza to my Upstart configuration file:

[root@centos ~]# cat /etc/init/maxscale.conf
# MaxScale service

description "MaxScale"

start on stopped rc RUNLEVEL=[2345]
stop on starting rc runlevel [!2345]

respawn
respawn limit 2 5

expect fork

exec /usr/bin/maxscale --user=maxscale
This way it works as expected, as one may easily check:

[root@centos ~]# initctl status maxscale
maxscale start/running, process 6600
[root@centos ~]# maxadmin
MaxScale> show servers
Server 0x19e47a0 (server1)
        Server:                              127.0.0.1
        Status:                              Master, Running
        Protocol:                            MySQLBackend
        Port:                                3306
        Server Version:                      5.7.15-9-log
        Node Id:                             1
        Master Id:                           -1
        Slave Ids:
        Repl Depth:                          0
        Number of connections:               0
        Current no. of conns:                0
        Current no. of operations:           0
MaxScale> quit
[root@centos ~]# kill -9 6600
[root@centos ~]# ps aux | grep maxscale
maxscale  6627  2.0  0.2 276168  3884 ?        Ssl  20:41   0:00 /usr/bin/maxscale --user=maxscale
root      6633  0.0  0.0 103312   872 pts/0    S+   20:41   0:00 grep maxscale
[root@centos ~]# initctl status maxscale
maxscale start/running, process 6627

In /var/log/messages we can clearly see that the process is respawned by init:

...
Nov 26 20:38:15 centos maxscale[6600]: Started MaxScale log flusher.
Nov 26 20:40:33 centos maxscale[6600]: Loaded module MaxAdminAuth: V2.0.0 from /usr/lib64/maxscale/libMaxAdminAuth.so
Nov 26 20:41:52 centos init: maxscale main process (6600) killed by KILL signal
Nov 26 20:41:52 centos init: maxscale main process ended, respawning
Nov 26 20:41:52 centos maxscale[6627]: Working directory: /var/log/maxscale
Nov 26 20:42:07 centos maxscale[6627]: Loaded module MaxAdminAuth: V2.0.0 from /usr/lib64/maxscale/libMaxAdminAuth.so
...
After a few more checks I asked for this to be implemented officially in packages for the Linux distributions that use Upstart; see MXS-1027.

To summarize, I wish I had cared to find out how Upstart works a long time ago. Now it's probably time to study systemd :) Anyway, after some reading and testing one can use Upstart efficiently to provide automated service starts and restarts for MySQL server and the services used with it.

Saturday, November 12, 2016

MySQL Support Engineer's Chronicles, Issue #3

The original idea of this series was to publish one post per week, but it seems every other week I have some special topic that well deserves a dedicated post. Last week I had no time to complete my writing because of a long (and, I hope, useful) Howto post on replacing a corrupted partition using a non-corrupted one from another server in a replication setup. But I had links and notes collected in a draft that I am going to complete now.

First of all, during the previous week I had time to submit two more talks for the "MySQL and Friends Devroom" at FOSDEM 2017: "Applying profilers to MySQL" and "Background Story of a MySQL Bug". The call for papers is still open, as far as I understand, so I may come up with a few more ideas on what to talk about.

Strange as it is to admit, sometimes I can be excited by something MySQL-related. Two weeks ago I added a note to myself about the great, detailed changelogs that MariaDB publishes, with proper links to GitHub commits. Check the one for 10.0.28, for example. I wish Oracle provided the same level of detail for MySQL releases as part of their official release notes.

Still, sometimes important changes happen in upstream MySQL, get merged, and details about the inherited incompatible change (and its real impact) are missing from release notes of any kind. Let's consider a recent example. Historically MySQL treated incorrect utf8 byte sequences differently for INSERT than for LOAD DATA: INSERT failed, while LOAD DATA just truncated the value at the first incorrect character and continued. Eventually (in MySQL 5.6.32) this was fixed by upstream MySQL (it was also fixed in a wider context in MariaDB 10.2 as part of MDEV-9823). The MySQL 5.6.32 release notes say about the incompatible change:
  • "Incompatible Change: For multibyte character sets, LOAD DATA could fail to allocate space correctly and ignore input rows as a result. (Bug #76237, Bug #20683959, Bug #23080148)
    References: This issue is a regression of: Bug #14653594."
But they surely say nothing about the impact on replication, or about another bug, Bug #78758, that is fixed in 5.6.32 (by making LOAD DATA fail with an error). It cost me some time to figure out all the details. Read MDEV-11217 for the historical details, a nice test case and different views on the decisions made. Note also that the error message about a bad utf8 character from LOAD DATA now looks (IMHO) weird enough, as it actually contains only the valid part of the string. See my MDEV-11216 about this.

I am still having fun with InnoDB locks. This week I checked what locks are set when the same DELETE (on a table with a primary key and a unique secondary index) is executed twice in the same transaction. Check Bug #83640 and tell me how this sequence of locks set in one transaction can make any sense:
---TRANSACTION 636201, ACTIVE 202 sec
5 lock struct(s), heap size 1184, 4 row lock(s), undo log entries 1
MySQL thread id 1, OS thread handle 0x7f9e513a7700, query id 92 localhost root init
show engine innodb status
TABLE LOCK table `test`.`tu` trx id 636201 lock mode IX
RECORD LOCKS space id 11 page no 4 n bits 72 index `u` of table `test`.`tu` trx id 636201 lock_mode X locks rec but not gap
RECORD LOCKS space id 11 page no 3 n bits 72 index `PRIMARY` of table `test`.`tu` trx id 636201 lock_mode X locks rec but not gap
RECORD LOCKS space id 11 page no 4 n bits 72 index `u` of table `test`.`tu` trx id 636201 lock_mode X
RECORD LOCKS space id 11 page no 4 n bits 72 index `u` of table `test`.`tu` trx id 636201 lock_mode X locks gap before rec
My colleague Jan Lindström was also surprised, so we have MDEV-11215 as well, and a chance to see this studied and maybe changed by MariaDB engineers. Related problems were discussed in the past, see Bug #19762 and Bug #55717.

Some days I keep wondering what is happening to XtraBackup these days at Percona. As far as I remember, I was not able to reproduce lp:1461833 while working there, but the fact that the bug is still open and has got no comments since I re-opened it gives me no additional confidence.

I report bugs and missing details in MySQL way too often, even for my own liking. But the MySQL manual really does miss many details needed to explain the results users see in production. This week I'd like to remind you of one of my bug reports about missing details in the MySQL documentation, Bug #77390, and my request there:
"Please, explain all metadata and InnoDB locks set by online ALTER, with examples and details enough to explain non-trivial cases..."
Honestly, until this month I had never noticed that a DDL log exists in MySQL. It is a binary file, ddl_log.log, that can be "dumped" into a somewhat readable form using a script by Mattias Jonsson from Bug #47343:
[openxs@fc23 5.7]$ perl ~/ddl_log_dump.pl data/ddl_log.log
Header: Num entries: 4 Name length: 512 Block length 4096
Entry 1 type i action s next 0
  name ./test/trange2
not 'e' entry (i)
Entry 2 type l action d next 0
  name ./test/#sql-trange2
not 'e' entry (l)
Entry 3 type l action d next 2
  name ./test/trange2#P#pmax
not 'e' entry (l)

This file may keep growing until a MySQL server restart completes, and what's worse, once it grows over 4GB in size it becomes unusable and effectively blocks any further concurrent DDL until we get rid of it. I had a lot of fun reading the code and reporting the related Bug #83708. Unlucky users who do a lot of partitioning-related DDL may find the situation less funny when they hit this bug.
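Until the bug is fixed, watching the size of this file is cheap insurance. A minimal sketch, assuming a conventional datadir path (adjust DATADIR for your installation):

```shell
# Warn when ddl_log.log approaches the 4GB size at which it becomes
# unusable and starts blocking concurrent DDL (see Bug #83708).
# DATADIR is an assumed default; override it for your server.
DATADIR=${DATADIR:-/var/lib/mysql}
f="$DATADIR/ddl_log.log"
limit=$((4 * 1024 * 1024 * 1024))
if [ -f "$f" ] && [ "$(stat -c%s "$f")" -ge "$limit" ]; then
    echo "WARNING: $f has reached $(stat -c%s "$f") bytes; concurrent DDL may be blocked"
fi
```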


I plan to describe what I had to work on this week soon, while I still remember all the relevant details and feelings. So, stay tuned!

Saturday, November 5, 2016

How to Recover Corrupted InnoDB Partition Tablespace in Replication Setup

This week I've got a question that sounded basically like this:
"Is it possible to just copy the entire partition from the replicated server?"
Let me share some background. As happens sometimes, a user had a huge table with many partitions, let's say hundreds of gigabytes in size each, and one of them unfortunately got corrupted. It happened on the master in a replication setup, but luckily they had used innodb_file_per_table=1 and they had a slave that was more or less in sync with the master. This allowed them to reconfigure replication and continue to work, but the task remained to eventually put the master back in use with correct data in the corrupted partition. Let's assume that dumping and reloading data from one of the instances in the replication setup is not a desired option, as it would take too much time compared to just copying the partition tablespace file. Hence the question above...
Side note: let's assume for simplicity that the corrupted partition is not getting changes at the moment, but even if it were, we could theoretically replay them (by careful processing of binary logs, for example).
My quick and simple answer was that surely it is possible with InnoDB from MySQL 5.6, and there is a great blog post that provides a lot of relevant details and steps, this one by my former colleague from Percona, Jervin Real. I remember I had to deal with orphaned or corrupted partitions more than once in the past, while working for Percona. It is NOT possible to import partition tablespaces directly in MySQL 5.6, and I've even reported this as Bug #70196, but in 5.7.4+ after the fix for Bug #52422 it is possible. Anyway, the nice partition exchange feature, as Jervin explained, helps to overcome this limitation easily in MySQL 5.6 as well.

The user decided to try this, but surely dealing with corrupted InnoDB tables is not as easy as dealing with non-corrupted ones. There are many things that could go wrong and require additional steps. For example, hardly any partition exchange is possible on a table with a corrupted partition, so I suggested doing it on a clean table without any corruption.

I still feel that things may go wrong, so I decided to make a test and show, step by step, how it works and what may go wrong in the process. To do this I used one of the replication sandboxes at hand, with MySQL 5.6.28, and after making sure replication worked on both slaves, I created a simple partitioned table on the master with the following statements:
create table tp(id int, val int, ts datetime, primary key(id, ts)) partition by range (year(ts)) (partition p0 values less than (2006), partition p1 values less than (2015), partition px values less than maxvalue);
insert into tp values(1, 10, '2005-10-01');
insert into tp values(2, 10, '2014-10-02');
insert into tp values(3, 30, now());
This is what I've got on master:
master [localhost] {msandbox} (test) > select * from tp;
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  1 |   10 | 2005-10-01 00:00:00 |
|  2 |   10 | 2014-10-02 00:00:00 |
|  3 |   30 | 2016-11-05 17:34:30 |
+----+------+---------------------+
3 rows in set (0,00 sec)

master [localhost] {msandbox} (test) > select * from tp partition(px);
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  3 |   30 | 2016-11-05 17:34:30 |
+----+------+---------------------+
1 row in set (0,00 sec)

master [localhost] {msandbox} (test) > select * from tp partition(p0);
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  1 |   10 | 2005-10-01 00:00:00 |
+----+------+---------------------+
1 row in set (0,00 sec)

master [localhost] {msandbox} (test) > select * from tp partition(p1);
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  2 |   10 | 2014-10-02 00:00:00 |
+----+------+---------------------+
1 row in set (0,00 sec)

Then I checked that the data was replicated to the slave that I am going to corrupt soon:
slave1 [localhost] {msandbox} (test) > select * from tp partition(p1);
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  2 |   10 | 2014-10-02 00:00:00 |
+----+------+---------------------+
1 row in set (0,00 sec)
It's time to stop MySQL server and corrupt the data in partition p1:
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ./stop
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ls -l data/test
total 152012
...

-rw-rw---- 1 openxs openxs     8610 лис  5 17:34 tp.frm
-rw-rw---- 1 openxs openxs       32 лис  5 17:34 tp.par
-rw-rw---- 1 openxs openxs    98304 лис  5 17:34 tp#P#p0.ibd
-rw-rw---- 1 openxs openxs    98304 лис  5 17:34 tp#P#p1.ibd
-rw-rw---- 1 openxs openxs    98304 лис  5 17:34 tp#P#px.ibd

openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ hexdump -C data/test/tp#P#p1.ibd | more
00000000  22 e2 b8 a3 00 00 00 00  00 00 00 00 00 00 00 00  |"...............|
00000010  00 00 00 00 0f f4 ae 86  00 08 00 00 00 00 00 00  |................|
00000020  00 00 00 00 00 0c 00 00  00 0c 00 00 00 00 00 00  |................|
00000030  00 06 00 00 00 40 00 00  00 00 00 00 00 04 00 00  |.....@..........|
00000040  00 00 ff ff ff ff 00 00  ff ff ff ff 00 00 00 00  |................|
00000050  00 01 00 00 00 00 00 9e  00 00 00 00 00 9e 00 00  |................|
...

openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ dd if=/dev/zero of=data/test/tp#P#p1.ibd bs=1 count=98304
98304+0 records in
98304+0 records out
98304 bytes (98 kB) copied, 0,253177 s, 388 kB/s
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ hexdump -C data/test/tp#P#p1.ibd | more
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00018000
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ls -l data/test/tp*       
-rw-rw---- 1 openxs openxs  8610 лис  5 17:34 data/test/tp.frm
-rw-rw---- 1 openxs openxs    32 лис  5 17:34 data/test/tp.par
-rw-rw---- 1 openxs openxs 98304 лис  5 17:34 data/test/tp#P#p0.ibd
-rw-rw---- 1 openxs openxs 98304 лис  5 17:41 data/test/tp#P#p1.ibd
-rw-rw---- 1 openxs openxs 98304 лис  5 17:34 data/test/tp#P#px.ibd
I just filled the entire partition file with zeroes and verified its content before and after with hexdump. I think it's a bad enough kind of corruption: there is no way to recover the data, it is gone entirely, and the system areas of the .ibd file will surely not match what the InnoDB data dictionary thinks about the tablespace.
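"Entirely zeroed out" can also be verified without eyeballing hexdump output. A small sketch, using the partition file from this test (GNU cmp and stat assumed):

```shell
# Check whether a file consists of nothing but zero bytes by comparing
# its first $size bytes against /dev/zero (cmp -n limits the comparison
# to the file's own length, -s suppresses output).
f='data/test/tp#P#p1.ibd'
size=$(stat -c%s "$f")
if cmp -s -n "$size" "$f" /dev/zero; then
    echo "$f is entirely zero-filled"
else
    echo "$f contains non-zero bytes"
fi
```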

Now I can try to start the slave back up and access the data in the table:
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ./start
. sandbox server started
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ tail data/msandbox.err
2016-11-05 18:15:11 8092 [Note] Server hostname (bind-address): '127.0.0.1'; port: 22294
2016-11-05 18:15:11 8092 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
2016-11-05 18:15:11 8092 [Note] Server socket created on IP: '127.0.0.1'.
2016-11-05 18:15:11 8092 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=mysql_sandbox22294-relay-bin' to avoid this problem.
2016-11-05 18:15:11 8092 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2016-11-05 18:15:11 8092 [Note] Slave I/O thread: connected to master 'rsandbox@127.0.0.1:22293',replication started in log 'mysql-bin.000005' at position 1015
2016-11-05 18:15:11 8092 [Note] Event Scheduler: Loaded 0 events
2016-11-05 18:15:11 8092 [Note] /home/openxs/5.6.28/bin/mysqld: ready for connections.
Version: '5.6.28-log'  socket: '/tmp/mysql_sandbox22294.sock'  port: 22294  MySQL Community Server (GPL)
2016-11-05 18:15:12 8092 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000005' at position 1015, relay log './mysql_sandbox22294-relay-bin.000236' position: 1178
The table was not accessed during the recovery stage (as the slave was stopped cleanly), so there are no hints about the corruption in the error log. But as soon as we try to access the table:

slave1 [localhost] {msandbox} (test) > select * from tp;
ERROR 2013 (HY000): Lost connection to MySQL server during query
In the error log I see:

InnoDB: Error: tablespace id is 12 in the data dictionary
InnoDB: but in file ./test/tp#P#p1.ibd it is 0!
2016-11-05 18:17:03 7f0e9c257700  InnoDB: Assertion failure in thread 139700725970688 in file fil0fil.cc line 796
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
16:17:03 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=1
max_threads=151
thread_count=3
connection_count=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68108 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x276e860
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f0e9c256e50 thread_stack 0x40000
/home/openxs/5.6.28/bin/mysqld(my_print_stacktrace+0x35)[0x90f695]
/home/openxs/5.6.28/bin/mysqld(handle_fatal_signal+0x3d8)[0x674fc8]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f0eb0d99330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f0eaf993c37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f0eaf997028]
/home/openxs/5.6.28/bin/mysqld[0xae994e]
/home/openxs/5.6.28/bin/mysqld[0xae9ade]
/home/openxs/5.6.28/bin/mysqld[0xaf0419]
/home/openxs/5.6.28/bin/mysqld[0xabb7fb]
/home/openxs/5.6.28/bin/mysqld[0xabbe6b]
/home/openxs/5.6.28/bin/mysqld[0xaa996a]
/home/openxs/5.6.28/bin/mysqld[0xa9720f]
/home/openxs/5.6.28/bin/mysqld[0xa40b2e]
/home/openxs/5.6.28/bin/mysqld[0x99fca5]
/home/openxs/5.6.28/bin/mysqld[0x997d89]
/home/openxs/5.6.28/bin/mysqld[0x99f9d9]
/home/openxs/5.6.28/bin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x9c)[0x58c52c]
/home/openxs/5.6.28/bin/mysqld(_ZN12ha_partition8rnd_nextEPh+0x41)[0xb48441]
/home/openxs/5.6.28/bin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x64)[0x58c4f4]
/home/openxs/5.6.28/bin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x37)[0x8400a7]
/home/openxs/5.6.28/bin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x181)[0x6d3891]
/home/openxs/5.6.28/bin/mysqld(_ZN4JOIN4execEv+0x391)[0x6d16b1]
/home/openxs/5.6.28/bin/mysqld[0x718349]
/home/openxs/5.6.28/bin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_P10SQL_I_ListI8st_orderESB_S7_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xbc)[0x718e0c]
/home/openxs/5.6.28/bin/mysqld(_Z13handle_selectP3THDP13select_resultm+0x175)[0x719015]
/home/openxs/5.6.28/bin/mysqld[0x6f3769]
/home/openxs/5.6.28/bin/mysqld(_Z21mysql_execute_commandP3THD+0x3575)[0x6f7f25]
/home/openxs/5.6.28/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x338)[0x6fba58]
/home/openxs/5.6.28/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xc60)[0x6fce30]
/home/openxs/5.6.28/bin/mysqld(_Z10do_commandP3THD+0xd7)[0x6fec27]
/home/openxs/5.6.28/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x116)[0x6c5da6]
/home/openxs/5.6.28/bin/mysqld(handle_one_connection+0x45)[0x6c5e85]
/home/openxs/5.6.28/bin/mysqld(pfs_spawn_thread+0x126)[0x989be6]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7f0eb0d91184]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f0eafa5737d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f0e74005050): select * from tp
Connection ID (thread ID): 3
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
161105 18:17:04 mysqld_safe Number of processes running now: 0
161105 18:17:04 mysqld_safe mysqld restarted
So, the assertion failure happened because the tablespace id of the deliberately corrupted partition does not match what InnoDB expects based on the data dictionary. The server was restarted, but this time the restart was not successful:

...
2016-11-05 18:17:04 8146 [Note] InnoDB: Database was not shutdown normally!
2016-11-05 18:17:04 8146 [Note] InnoDB: Starting crash recovery.
2016-11-05 18:17:04 8146 [Note] InnoDB: Reading tablespace information from the .ibd files...
2016-11-05 18:17:04 8146 [ERROR] InnoDB: space header page consists of zero bytes in tablespace ./test/tp#P#p1.ibd (table test/tp#P#p1)
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size:1024 Pages to analyze:64
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size: 1024, Possible space_id count:0
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size:2048 Pages to analyze:48
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size: 2048, Possible space_id count:0
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size:4096 Pages to analyze:24
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size: 4096, Possible space_id count:0
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size:8192 Pages to analyze:12
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size: 8192, Possible space_id count:0
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size:16384 Pages to analyze:6
2016-11-05 18:17:04 8146 [Note] InnoDB: Page size: 16384, Possible space_id count:0
2016-11-05 18:17:04 7f0f0f745780  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
InnoDB: Error: could not open single-table tablespace file ./test/tp#P#p1.ibd
InnoDB: We do not continue the crash recovery, because the table may become
InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
InnoDB: To fix the problem and start mysqld:
InnoDB: 1) If there is a permission problem in the file and mysqld cannot
InnoDB: open the file, you should modify the permissions.
InnoDB: 2) If the table is not needed, or you can restore it from a backup,
InnoDB: then you can remove the .ibd file, and InnoDB will do a normal
InnoDB: crash recovery and ignore that table.
InnoDB: 3) If the file system or the disk is broken, and you cannot remove
InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
InnoDB: and force InnoDB to continue crash recovery here.
161105 18:17:04 mysqld_safe mysqld from pid file /home/openxs/sandboxes/rsandbox_mysql-5_6_28/node1/data/mysql_sandbox22294.pid ended
Now, what shall we do? There is no reason to try to start again and again. We should check that the partition's .ibd file is there (it is), has proper permissions (it does) and looks like a proper tablespace. We can check the latter if we have a proper tablespace file for the same partition elsewhere, by searching for known strings of data in the file, etc. In this case hexdump has already shown data corruption with no chance to restore anything. Time to ask the question that started this post...
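One cheap way to check whether an .ibd file still "looks like a proper tablespace" is to read the space id straight out of the file and compare it with the id that the error log says the data dictionary expects. In the FIL header of every InnoDB page the space id is stored as a 4-byte big-endian integer at offset 34; a sketch:

```shell
# Read the space id (a 4-byte big-endian integer at offset 34 of the FIL
# page header) from an .ibd file. For the healthy file dumped earlier
# this matches the data dictionary (12); for the zeroed one it is 0,
# exactly as the assertion failure message complained.
f='data/test/tp#P#p1.ibd'
od -An -tu1 -j34 -N4 "$f" |
    awk '{ print $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 }'
```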

With the answer in mind (yes, we can get the data from the partition's .ibd file of a non-corrupted table), how do we proceed in practice? First of all, we need a MySQL server that is started and working, so that we can run SQL statements. This one can not be started normally, and if we plan to restore data on it (for whatever reason) we have to start it first. To do so we have to try innodb_force_recovery settings, starting from 1. Depending on how and when the corruption happened, 1 may not be enough. In general I would not recommend going to any value greater than 4. It also makes sense to copy all files related to the corrupted table and, ideally, the shared tablespace to a safe place. Changes done with forced recovery may corrupt the data even more, so it's good to have a way to roll back and start again. I copied all .ibd files for the table to the /tmp directory, but in reality make sure you just rename/move them within the same physical filesystem, if you care about the huge size and the time it takes to copy:
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ cp data/test/tp*.ibd /tmp/
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ls -l /tmp/tp*.ibd
-rw-rw---- 1 openxs openxs 98304 Nov  5 18:49 /tmp/tp#P#p0.ibd
-rw-rw---- 1 openxs openxs 98304 Nov  5 18:49 /tmp/tp#P#p1.ibd
-rw-rw---- 1 openxs openxs 98304 Nov  5 18:49 /tmp/tp#P#px.ibd
Then I set innodb_force_recovery to 1 in [mysqld] section of my configuration file and started the server:

openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ grep innodb_force my.sandbox.cnf
innodb_force_recovery=1
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ./start
. sandbox server started
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ tail data/msandbox.err    
InnoDB: innodb_force_recovery is on: we do not allow
InnoDB: database modifications by the user. Shut down
InnoDB: mysqld and edit my.cnf so that
InnoDB: innodb_force_... is removed.
2016-11-05 18:25:25 8491 [ERROR] Error writing relay log configuration.
2016-11-05 18:25:25 8491 [ERROR] Error reading relay log configuration.
2016-11-05 18:25:25 8491 [ERROR] Failed to initialize the master info structure
2016-11-05 18:25:25 8491 [Note] Check error log for additional messages. You will not be able to start replication until the issue is resolved and the server restarted.
2016-11-05 18:25:25 8491 [Note] Event Scheduler: Loaded 0 events
2016-11-05 18:25:25 8491 [Note] /home/openxs/5.6.28/bin/mysqld: ready for connections.
Version: '5.6.28-log'  socket: '/tmp/mysql_sandbox22294.sock'  port: 22294  MySQL Community Server (GPL)
It seems the server started, but when we try to access the table a strange thing happens:
slave1 [localhost] {msandbox} (test) > select * from tp;
ERROR 1146 (42S02): Table 'test.tp' doesn't exist
slave1 [localhost] {msandbox} (test) > show tables;
+----------------+
| Tables_in_test |
+----------------+
| t              |
| t1             |
| t2             |
| t_local        |
| tp             |
+----------------+
5 rows in set (0,00 sec)

slave1 [localhost] {msandbox} (test) > show create table tp\G
ERROR 1146 (42S02): Table 'test.tp' doesn't exist
The table is both there and NOT there at the same time. This indicates some problem with the InnoDB data dictionary. In the error log we see:
2016-11-05 18:26:14 8491 [ERROR] InnoDB: Failed to find tablespace for table '"test"."tp" /* Partition "p1" */' in the cache. Attempting to load the tablespace with space id 12.
2016-11-05 18:26:14 8491 [ERROR] InnoDB: In file './test/tp#P#p1.ibd', tablespace id and flags are 0 and 0, but in the InnoDB data dictionary they are 12 and 0.
Have you moved InnoDB .ibd files around without using the commands DISCARD TABLESPACE and IMPORT TABLESPACE? Please refer to http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2016-11-05 18:26:14 7f434af03700  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
2016-11-05 18:26:14 8491 [ERROR] InnoDB: Could not find a valid tablespace file for 'test/tp#P#p1'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2016-11-05 18:26:14 7f434af03700 InnoDB: cannot calculate statistics for table "test"."tp" /* Partition "p1" */ because the .ibd file is missing. For help, please refer to http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html
2016-11-05 18:26:18 7f434af03700 InnoDB: cannot calculate statistics for table "test"."tp" /* Partition "p1" */ because the .ibd file is missing. For help, please refer to http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html
2016-11-05 18:26:51 7f434af03700 InnoDB: cannot calculate statistics for table "test"."tp" /* Partition "p1" */ because the .ibd file is missing. For help, please refer to http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting.html
Forced recovery mode is designed for one of two main actions on the problematic table: either to select some remaining data from it (we tried and cannot do this, and we don't care, as we have the data on the other server in the replication setup), or to drop it (remember, we also copied the .ibd files elsewhere). Nothing else works with the table, but DROP does:
slave1 [localhost] {msandbox} (test) > drop table tp;
Query OK, 0 rows affected (0,32 sec)
Now we can try to restart without forced recovery and try to recreate the table and link orphaned .ibd files back:

openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ./stop
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ vi my.sandbox.cnf
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ./start
. sandbox server started
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ grep innodb_force my.sandbox.cnf
#innodb_force_recovery=1
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ tail data/msandbox.err   
2016-11-05 18:51:50 8861 [Note] Server hostname (bind-address): '127.0.0.1'; port: 22294
2016-11-05 18:51:50 8861 [Note]   - '127.0.0.1' resolves to '127.0.0.1';
2016-11-05 18:51:50 8861 [Note] Server socket created on IP: '127.0.0.1'.
2016-11-05 18:51:50 8861 [Warning] Neither --relay-log nor --relay-log-index were used; so replication may break when this MySQL server acts as a slave and has his hostname changed!! Please use '--relay-log=mysql_sandbox22294-relay-bin' to avoid this problem.
2016-11-05 18:51:50 8861 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2016-11-05 18:51:50 8861 [Note] Slave I/O thread: connected to master 'rsandbox@127.0.0.1:22293',replication started in log 'mysql-bin.000005' at position 1015
2016-11-05 18:51:50 8861 [Note] Event Scheduler: Loaded 0 events
2016-11-05 18:51:50 8861 [Note] /home/openxs/5.6.28/bin/mysqld: ready for connections.
Version: '5.6.28-log'  socket: '/tmp/mysql_sandbox22294.sock'  port: 22294  MySQL Community Server (GPL)
2016-11-05 18:51:51 8861 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000005' at position 1015, relay log './mysql_sandbox22294-relay-bin.000237' position: 4
It looks promising so far. We can get the DDL for the table from the other server in the replication setup (or any other source, like an older dump) and try to create the table in a hurry:
master [localhost] {msandbox} (test) > show create table tp\G
*************************** 1. row ***************************
       Table: tp
Create Table: CREATE TABLE `tp` (
  `id` int(11) NOT NULL DEFAULT '0',
  `val` int(11) DEFAULT NULL,
  `ts` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`id`,`ts`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (year(ts))
(PARTITION p0 VALUES LESS THAN (2006) ENGINE = InnoDB,
 PARTITION p1 VALUES LESS THAN (2015) ENGINE = InnoDB,
 PARTITION px VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0,00 sec)

slave1 [localhost] {msandbox} (test) > show tables;
+----------------+
| Tables_in_test |
+----------------+
| t              |
| t1             |
| t2             |
| t_local        |
+----------------+
4 rows in set (0,00 sec)

slave1 [localhost] {msandbox} (test) > CREATE TABLE `tp` (   `id` int(11) NOT NULL DEFAULT '0',   `val` int(11) DEFAULT NULL,   `ts` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',   PRIMARY KEY (`id`,`ts`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 /*!50100 PARTITION BY RANGE (year(ts)) (PARTITION p0 VALUES LESS THAN (2006) ENGINE = InnoDB,  PARTITION p1 VALUES LESS THAN (2015) ENGINE = InnoDB,  PARTITION px VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;
ERROR 1813 (HY000): Tablespace for table '`test`.`tp` /* Partition `p1` */' exists. Please DISCARD the tablespace before IMPORT.
So, it looks like DROP TABLE worked, but the corrupted partition file remained? Yes, it did:

openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ ls -l data/test/tp*.ibd 
-rw-rw---- 1 openxs openxs 98304 Nov  5 17:41 data/test/tp#P#p1.ibd
We have to remove it and then try again:

openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$ rm data/test/tp*.ibd

slave1 [localhost] {msandbox} (test) > CREATE TABLE `tp` (   `id` int(11) NOT NULL DEFAULT '0',   `val` int(11) DEFAULT NULL,   `ts` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',   PRIMARY KEY (`id`,`ts`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1 /*!50100 PARTITION BY RANGE (year(ts)) (PARTITION p0 VALUES LESS THAN (2006) ENGINE = InnoDB,  PARTITION p1 VALUES LESS THAN (2015) ENGINE = InnoDB,  PARTITION px VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;
Query OK, 0 rows affected (0,75 sec)
Now we can mostly follow Jervin's blog post and create a non-partitioned table of the same structure with the following DDL statements:

create table tp_tmp like tp;
alter table tp_tmp remove partitioning;
alter table tp_tmp discard tablespace;
At this moment we can re-import the partitions one by one. In reality, instead of copying we may rename or move the files for those partitions that were not corrupted.
slave1 [localhost] {msandbox} (test) > \! cp /tmp/tp#P#p0.ibd data/test/tp_tmp.ibd
slave1 [localhost] {msandbox} (test) > \! ls -l data/test/tp_tmp.*             
-rw-rw---- 1 openxs openxs  8610 Nov  5 19:05 data/test/tp_tmp.frm
-rw-rw---- 1 openxs openxs 98304 Nov  5 19:07 data/test/tp_tmp.ibd
slave1 [localhost] {msandbox} (test) > alter table tp_tmp import tablespace;
Query OK, 0 rows affected, 1 warning (0,34 sec)

slave1 [localhost] {msandbox} (test) > show warnings\G
*************************** 1. row ***************************
  Level: Warning
   Code: 1810
Message: InnoDB: IO Read error: (2, No such file or directory) Error opening './test/tp_tmp.cfg', will attempt to import without schema verification
1 row in set (0,00 sec)

slave1 [localhost] {msandbox} (test) > select * from tp_tmp;
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  1 |   10 | 2005-10-01 00:00:00 |
+----+------+---------------------+
1 row in set (0,00 sec)
We get the warning, but it can be safely ignored - we know for sure that the table structure matches the imported .ibd file. Now we are ready to exchange the empty partition with the proper partition data in the table:

slave1 [localhost] {msandbox} (test) > alter table tp exchange partition p0 with table tp_tmp;
Query OK, 0 rows affected (0,51 sec)

slave1 [localhost] {msandbox} (test) > select * from tp_tmp;
Empty set (0,00 sec)

slave1 [localhost] {msandbox} (test) > select * from tp;
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  1 |   10 | 2005-10-01 00:00:00 |
+----+------+---------------------+
1 row in set (0,01 sec)
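The per-partition cycle above (discard the shim tablespace, copy the saved file in place, import, exchange) is the same every time, so it can be scripted. Below is a dry-run sketch that only prints the steps for each partition - table name, partition names and paths are the ones from this example, and you would wire the SQL lines to a real mysql client invocation before using anything like it:

```shell
# Print the restore steps for one partition of test.tp via the tp_tmp shim table.
# Dry run only: nothing is executed, the statements are just emitted.
steps_for_partition() {
  part="$1"   # p0, p1 or px in this example
  cat <<EOF
ALTER TABLE test.tp_tmp DISCARD TABLESPACE;
-- shell step: cp /tmp/tp#P#${part}.ibd data/test/tp_tmp.ibd
ALTER TABLE test.tp_tmp IMPORT TABLESPACE;
ALTER TABLE test.tp EXCHANGE PARTITION ${part} WITH TABLE tp_tmp;
EOF
}

for p in p0 p1 px; do steps_for_partition "$p"; done
```

Printing first and executing by hand is deliberate here: with corrupted tablespaces involved, it is safer to review each step than to run a blind loop.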
Then we can do the same for the other partition(s) stored on the same server - all but the corrupted one:
slave1 [localhost] {msandbox} (test) > alter table tp_tmp discard tablespace;
Query OK, 0 rows affected (0,05 sec)

slave1 [localhost] {msandbox} (test) > \! cp /tmp/tp#P#px.ibd data/test/tp_tmp.ibd
slave1 [localhost] {msandbox} (test) > alter table tp_tmp import tablespace;
Query OK, 0 rows affected, 1 warning (0,35 sec)

slave1 [localhost] {msandbox} (test) > select * from tp_tmp;
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  3 |   30 | 2016-11-05 17:34:30 |
+----+------+---------------------+
1 row in set (0,00 sec)

slave1 [localhost] {msandbox} (test) > alter table tp exchange partition px with table tp_tmp;
Query OK, 0 rows affected (0,53 sec)

slave1 [localhost] {msandbox} (test) > select * from tp_tmp;                   
Empty set (0,00 sec)

slave1 [localhost] {msandbox} (test) > select * from tp;
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  1 |   10 | 2005-10-01 00:00:00 |
|  3 |   30 | 2016-11-05 17:34:30 |
+----+------+---------------------+
2 rows in set (0,00 sec)
The file for the corrupted partition we have to copy from elsewhere; in my case, from the datadir of the master sandbox instance:
slave1 [localhost] {msandbox} (test) > alter table tp_tmp discard tablespace;  
Query OK, 0 rows affected (0,04 sec)

slave1 [localhost] {msandbox} (test) > \! cp ../master/data/test/tp#P#p1.ibd data/test/tp_tmp.ibd
slave1 [localhost] {msandbox} (test) > alter table tp_tmp import tablespace;   
Query OK, 0 rows affected, 1 warning (0,36 sec)

slave1 [localhost] {msandbox} (test) > select * from tp_tmp;
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  2 |   10 | 2014-10-02 00:00:00 |
+----+------+---------------------+
1 row in set (0,00 sec)

slave1 [localhost] {msandbox} (test) > alter table tp exchange partition px with table tp_tmp;
ERROR 1737 (HY000): Found a row that does not match the partition
slave1 [localhost] {msandbox} (test) > alter table tp exchange partition p1 with table tp_tmp;
Query OK, 0 rows affected (0,47 sec)

slave1 [localhost] {msandbox} (test) > select * from tp_tmp;
Empty set (0,00 sec)

slave1 [localhost] {msandbox} (test) > select * from tp partition(p1);
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  2 |   10 | 2014-10-02 00:00:00 |
+----+------+---------------------+
1 row in set (0,01 sec)
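The ERROR 1737 above is easy to predict: the target partition for a row is determined by year(ts) against the VALUES LESS THAN bounds in the DDL, so a 2014 row cannot live in px. A tiny sketch of that mapping, with the bounds copied from the CREATE TABLE shown earlier:

```shell
# Which partition of test.tp a given year(ts) falls into, per the DDL:
# p0: VALUES LESS THAN (2006), p1: VALUES LESS THAN (2015), px: MAXVALUE.
partition_for_year() {
  y="$1"
  if   [ "$y" -lt 2006 ]; then echo p0
  elif [ "$y" -lt 2015 ]; then echo p1
  else echo px
  fi
}

partition_for_year 2014   # prints p1 - the 2014-10-02 row belongs to p1, not px
```

EXCHANGE PARTITION validates every row of the standalone table against the same condition, which is exactly why the px attempt failed and the p1 attempt succeeded.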
As you can see from my small mistake above, if the imported data do not match the range partition condition, they are not imported and you get an error. We are done with restoring the data, and now we can check if replication works. Let me change all the data on master:
master [localhost] {msandbox} (test) > update tp set val = val + 1;
Query OK, 3 rows affected (0,03 sec)
Rows matched: 3  Changed: 3  Warnings: 0

master [localhost] {msandbox} (test) > select * from tp;
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  1 |   11 | 2005-10-01 00:00:00 |
|  2 |   11 | 2014-10-02 00:00:00 |
|  3 |   31 | 2016-11-05 17:34:30 |
+----+------+---------------------+
3 rows in set (0,00 sec)
and make sure they are replicated to the slave:
slave1 [localhost] {msandbox} (test) > select * from tp;
+----+------+---------------------+
| id | val  | ts                  |
+----+------+---------------------+
|  1 |   11 | 2005-10-01 00:00:00 |
|  2 |   11 | 2014-10-02 00:00:00 |
|  3 |   31 | 2016-11-05 17:34:30 |
+----+------+---------------------+
3 rows in set (0,00 sec)

In the error log we'll see mostly messages related to importing tablespaces; the only [ERROR] is from our earlier attempt to create the table while the orphaned .ibd file was still in place:
...
2016-11-05 19:01:02 9201 [ERROR] InnoDB: The file './test/tp#P#p1.ibd' already exists though the corresponding table did not exist in the InnoDB data dictionary. Have you moved InnoDB .ibd files around without using the SQL commands DISCARD TABLESPACE and IMPORT TABLESPACE, or did mysqld crash in the middle of CREATE TABLE? You can resolve the problem by removing the file './test/tp#P#p1.ibd' under the 'datadir' of MySQL.
2016-11-05 19:07:58 9201 [Note] InnoDB: Sync to disk
2016-11-05 19:07:58 9201 [Note] InnoDB: Sync to disk - done!
2016-11-05 19:07:58 9201 [Note] InnoDB: Phase I - Update all pages
2016-11-05 19:07:58 9201 [Note] InnoDB: Sync to disk
2016-11-05 19:07:58 9201 [Note] InnoDB: Sync to disk - done!
2016-11-05 19:07:58 9201 [Note] InnoDB: Phase III - Flush changes to disk
2016-11-05 19:07:58 9201 [Note] InnoDB: Phase IV - Flush complete
2016-11-05 19:10:48 9201 [Note] InnoDB: Sync to disk
2016-11-05 19:10:48 9201 [Note] InnoDB: Sync to disk - done!
2016-11-05 19:10:48 9201 [Note] InnoDB: Phase I - Update all pages
2016-11-05 19:10:48 9201 [Note] InnoDB: Sync to disk
2016-11-05 19:10:48 9201 [Note] InnoDB: Sync to disk - done!
2016-11-05 19:10:48 9201 [Note] InnoDB: Phase III - Flush changes to disk
2016-11-05 19:10:49 9201 [Note] InnoDB: Phase IV - Flush complete
2016-11-05 19:12:42 9201 [Note] InnoDB: Sync to disk
2016-11-05 19:12:42 9201 [Note] InnoDB: Sync to disk - done!
2016-11-05 19:12:42 9201 [Note] InnoDB: Phase I - Update all pages
2016-11-05 19:12:42 9201 [Note] InnoDB: Sync to disk
2016-11-05 19:12:42 9201 [Note] InnoDB: Sync to disk - done!
2016-11-05 19:12:42 9201 [Note] InnoDB: Phase III - Flush changes to disk
2016-11-05 19:12:42 9201 [Note] InnoDB: Phase IV - Flush complete
openxs@ao756:~/sandboxes/rsandbox_mysql-5_6_28/node1$
I am too tired to write a nice summary right now, so I'll write it in some other post. Now I (and you, my readers) have a detailed enough reference of all the boring steps involved in recovering a corrupted partition from an .ibd file taken from another server in a replication setup.