NUST-RMB2013 Database

Character Sets

 The NUST-RMB2013 Database was collected from images of RMB banknotes in daily circulation (renminbi, the paper currency used in China) by the Department of Computer Science, Nanjing University of Science and Technology, during 2012-2013. The database covers all classes of characters that appear in RMB serial numbers. It is provided as grayscale character images with gray levels from 0 to 255, and consists of a training set of approximately 500 samples for each of the 35 categories (digits 0-9 and letters A-Z except V) and a testing set of 200 samples per class. The samples have been normalized and centered in 36x60 images.
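 The class set and image size described above can be written down as a minimal Python sketch. The names used here (CLASSES, CLASS_TO_INDEX, to_rows) are hypothetical, and two points are assumptions rather than facts from the database documentation: that "36x60" means 36 pixels wide by 60 pixels high, and that a sample is available as raw 8-bit pixels in row-major order. The actual file format of the distributed data may differ.

```python
import string

# The 35 classes described above: digits 0-9 plus letters A-Z without V.
CLASSES = list(string.digits) + [c for c in string.ascii_uppercase if c != "V"]
CLASS_TO_INDEX = {label: index for index, label in enumerate(CLASSES)}

# Assumption: "36x60" means 36 pixels wide by 60 pixels high.
WIDTH, HEIGHT = 36, 60
PIXELS_PER_SAMPLE = WIDTH * HEIGHT  # 2160 gray values per character image

# Nominal set sizes implied by the per-class counts (training is approximate).
APPROX_TRAIN_SIZE = len(CLASSES) * 500
TEST_SIZE = len(CLASSES) * 200


def to_rows(raw: bytes) -> list[list[float]]:
    """Split one raw 8-bit sample into rows of gray values scaled to [0, 1].

    Assumes a hypothetical row-major layout with one byte per pixel.
    """
    if len(raw) != PIXELS_PER_SAMPLE:
        raise ValueError(f"expected {PIXELS_PER_SAMPLE} bytes, got {len(raw)}")
    return [
        [value / 255.0 for value in raw[r * WIDTH:(r + 1) * WIDTH]]
        for r in range(HEIGHT)
    ]
```

 Mapping each label to an integer index up front keeps the class order fixed across training and testing, which matters when results are reported per class.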

Data Collection

 We collected the RMB database from two versions (1999 and 2005) of the 100-yuan RMB banknote. The contact image sensor (CIS) installed in a money counting machine scanned the banknotes at a resolution of 200x180 dpi and produced see-through RMB images. We located the serial number region using a priori knowledge of the serial number's size and location, then manually selected the extraction results that were complete and human-readable, labeled their categories, and assigned them to the training and test sets. Uneven illumination, contrast variation, smears, and varied background patterns (including complex textures and small anti-counterfeiting circles) make the characters hard to recognize and make the database more challenging.

A comprehensive description of the database has been published in Pattern Recognition (download the paper).

Samples of NUST-RMB2013 dataset

Challenging samples of NUST-RMB2013 dataset

Download Application

 The database is free for academic research under an agreement (download the agreement). To use it, please read the agreement carefully, fill in the application form, and send it back by email as a scanned PDF.

Recommended Usage

 This database can be used for the design and evaluation of character recognition algorithms and classifiers.
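 As an illustration of classifier evaluation on the testing set, the following Python sketch computes overall and per-class accuracy from lists of true and predicted labels. The function names are hypothetical and this is not an evaluation protocol prescribed by the database authors; it is only one simple way to score a classifier.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Return {label: fraction of that label's samples predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    totals = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {label: correct[label] / totals[label] for label in totals}


def overall_accuracy(true_labels, predicted_labels):
    """Return the fraction of all samples predicted correctly."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)
```

 Per-class figures are worth reporting alongside the overall score, since visually similar characters (for example the digit 0 and the letter O) can be confused even when the overall accuracy is high.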