Email: service@parnassusdata.com 7 x 24 online support!

MySQL UnDROP tool for InnoDB

MySQL UnDROP tool for InnoDB

TwinDB data recovery toolkit is a set of tools that work with InnoDB tablespaces at low level.
 
Incredible Performance of stream_parser
stream_parser is a tool that finds InnoDB pages in stream of bytes. It can be either file such as ibdata1, *.ibd or raw partition.
stream_parser runs as many parallel workers as number of CPUs in the system.  The performance of stream_parser is amazing! Compare how stream_parser outperforms page_parser on a four-CPU virtual machine running on
 
my laptop:
 
 
 
# ./page_parser -f /dev/mapper/vg_twindbdev-lv_root -t 18G
Opening file: /dev/mapper/vg_twindbdev-lv_root
...
Size to process:               19327352832 (18.000 GiB)
1.00% done. 2014-06-23 03:03:48 ETA(in 00:18 hours). Processing speed: 17570320 B/sec
2.00% done. 2014-06-23 03:05:27 ETA(in 00:19 hours). Processing speed: 16106127 B/sec
3.00% done. 2014-06-23 03:02:11 ETA(in 00:16 hours). Processing speed: 19327352 B/sec
4.00% done. 2014-06-23 03:03:48 ETA(in 00:17 hours). Processing speed: 17570320 B/sec
...
 
 
So, it takes almost 20 minutes to parse 18G partition.
 
 
Let’s check stream_parser
 
 
# ./stream_parser -f /dev/mapper/vg_twindbdev-lv_root -t 18G
 
...
Size to process:               19327352832 (18.000 GiB)
Worker(0): 1.91% done. 2014-06-23 02:51:41 ETA(in 00:00:56). Processing speed: 79.906 MiB/sec
Worker(2): 1.74% done. 2014-06-23 02:51:47 ETA(in 00:01:02). Processing speed: 72.000 MiB/sec
Worker(3): 3.30% done. 2014-06-23 02:51:15 ETA(in 00:00:30). Processing speed: 144.000 MiB/sec
Worker(1): 1.21% done. 2014-06-23 02:52:20 ETA(in 00:01:35). Processing speed: 47.906 MiB/sec
Worker(2): 5.38% done. 2014-06-23 02:51:11 ETA(in 00:00:25). Processing speed: 168.000 MiB/sec
Worker(3): 9.72% done. 2014-06-23 02:51:00 ETA(in 00:00:14). Processing speed: 296.000 MiB/sec
...
Worker(0): 88.91% done. 2014-06-23 02:52:06 ETA(in 00:00:02). Processing speed: 191.625 MiB/sec
Worker(0): 93.42% done. 2014-06-23 02:52:06 ETA(in 00:00:01). Processing speed: 207.644 MiB/sec
Worker(0): 97.40% done. 2014-06-23 02:52:06 ETA(in 00:00:00). Processing speed: 183.641 MiB/sec
All workers finished in 31 sec
 
 
So, 18 minutes versus 31 seconds. 34 times faster! Impressive, isn’t it?
 
c_parser Improvements
 
 
c_parser is a tool that reads InnoDB page or many pages, extracts records and stores them in tab-separated values dumps. InnoDB page with user data doesn’t store information about table structure. You should tell c_parser what fields you’re looking for. Command line option -t specifies a file with CREATE TABLE statement.
 
This is how it works. Here’s the CREATE statement (I took it from mysqldump)
 
# cat sakila/actor.sql
CREATE TABLE `actor` (
  `actor_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
  `first_name` varchar(45) NOT NULL,
  `last_name` varchar(45) NOT NULL,
  `last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`actor_id`),
  KEY `idx_actor_last_name` (`last_name`)
) ENGINE=InnoDB AUTO_INCREMENT=201 DEFAULT CHARSET=utf8;
And now let’s fetch records of table actor from InnoDB pages:
 
# ./c_parser -6f pages-actor.ibd/FIL_PAGE_INDEX/0000000000001828.page -t sakila/actor.sql
-- Page id: 3, Format: COMPACT, Records list: Valid, Expected records: (200 200)
000000005313    970000013C0110  actor   1       "PENELOPE"      "GUINESS"       "2006-02-15 04:34:33"
000000005313    970000013C011B  actor   2       "NICK"  "WAHLBERG"      "2006-02-15 04:34:33"
000000005313    970000013C0126  actor   3       "ED"    "CHASE" "2006-02-15 04:34:33"
...
000000005313    970000013C09D8  actor   199     "JULIA" "FAWCETT"       "2006-02-15 04:34:33"
000000005313    970000013C09E4  actor   200     "THORA" "TEMPLE"        "2006-02-15 04:34:33"
 
 
-- Page id: 3, Found records: 200, Lost records: NO, Leaf page: YES
 
 
The version 5.6 of MySQL introduced few format changes. Most of them were already supported. The c_parser fixes on top of that some bugs in processing temporal fields.
 
The new UnDROP tool for InnoDB is still no reason not to take backups :-), but at least you can be armed better if the inevitable happens.
 
How to Recover Table Structure
MySQL stores table structure in a respective .frm file. When the table is dropped the .frm file is gone. Fortunately InnoDB stores copy of the structure in the dictionary. sys_parser is a tool that can read the dictionary and generate CREATE TABLE statement. Check how you can Recover Table Structure From InnoDB Dictionary.
 
How to Install TwinDB Data Recovery Toolkit
Check out the source code from LaunchPAD:
 
# $ bzr branch lp:undrop-for-innodb
Branched 33 revisions.
Or you can download an archive with the latest revision from download page.
 
Compile the source code. But first install dependencies: make, gcc, flex, bison.
 
root@twindb-dev undrop-for-innodb]# make
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c stream_parser.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include  -pthread -lm  stream_parser.o -o stream_parser
flex  sql_parser.l
bison  -o sql_parser.c sql_parser.y
sql_parser.y: conflicts: 6 shift/reduce
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c sql_parser.c
lex.yy.c:3078: warning: ‘yyunput’ defined but not used
lex.yy.c:3119: warning: ‘input’ defined but not used
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c c_parser.c
./include/ctype-latin1.c:359: warning: ‘my_mb_wc_latin1’ defined but not used
./include/ctype-latin1.c:372: warning: ‘my_wc_mb_latin1’ defined but not used
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c tables_dict.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c print_data.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe  -I./include -c check_data.c
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe   -I./include  sql_parser.o c_parser.o tables_dict.o print_data.o check_data.o -o c_parser -pthread -lm
cc -D_FILE_OFFSET_BITS=64 -Wall -g -O3 -pipe   -I./include -o innochecksum_changer innochecksum.c
[root@twindb-dev undrop-for-innodb]#
UPDATE:
 
The toolkit is tested on following systems:
 
CentOS release 5.10 (Final) x86_64
CentOS release 6.5 (Final) x86_64
CentOS Linux release 7.0.1406 (Core) x86_64
Fedora release 20 (Heisenbug) x86_64
Ubuntu 10.04.4 LTS (lucid) x86_64
Ubuntu 12.04.4 LTS (precise) x86_64
Ubuntu 14.04 LTS (trusty) x86_64
Debian GNU/Linux 7.5 (wheezy) x86_64
32 bit operating systems are not supported
 
 
 
 
An InnoDB index doesn’t carry information about the table structure indeed. MySQL keeps the structure in .frm files and InnoDB stores the structure in the dictionary. When the table structure isn’t available from external source (old backup, installation script etc) then possible way to recover the structure are:
 
1) Recover from .frm files. There are some tools around available . I prefer to create a dummy table, replace the .frm file and run SHOW CREATE TABLE. This option however is useless when DROP TABLE happens, MySQL deletes the .frm file as well.
 
2) Recover the structure from the InnoDB dictionary. InnoDB stores almost all necessary information about the table structure in the dictionary. When a user runs DROP TABLE the respective records are deleted from the dictionary tables, so when recover the dictionary tables you need to specify -D option to c_parser (-D recovers records that are marked as deleted). The tables you need are SYS_TABLES, SYS_INDEXES, SYS_FIELDS and SYS_COLUMNS. Then load everything into a live instance of MySQL. A tool sys_parser from the toolkit reads SYS_* tables from MySQL and generates CREATE TABLE statement.