CodeX
Published in

CodeX

Fuzzy query for CipherColumn | ShardingSphere 5.3.0 Deep Dive

1. Background

2. How to achieve fuzzy query in encrypted scenarios?

2.1 Load data to the in-memory database (IMDB)

2.2 Implement encryption & decryption functions consistent with database programs

Native SQL: select * from user where name like "%xxx%" 
After implementing the decryption function: ѕеlесt * frоm uѕеr whеrе dесоdе(namе) lіkе "%ххх%"

2.3 Store after data masking

For example, mobile number 13012345678 becomes 130****5678 after the masking algorithm is performed.

2.4 Perform encrypted storage after tokenization and combination

2.5 Single-character digest algorithm (default fuzzy query algorithm provided in ShardingSphere version 5.3.0)

3. Fuzzy query algorithm

public final class CharDigestLikeEncryptAlgorithm implements LikeEncryptAlgorithm<Object, String> {

private static final String DELTA = "delta";

private static final String MASK = "mask";

private static final String START = "start";

private static final String DICT = "dict";

private static final int DEFAULT_DELTA = 1;

private static final int DEFAULT_MASK = 0b1111_0111_1101;

private static final int DEFAULT_START = 0x4e00;

private static final int MAX_NUMERIC_LETTER_CHAR = 255;

@Getter
private Properties props;

private int delta;

private int mask;

private int start;

private Map<Character, Integer> charIndexes;

@Override
public void init(final Properties props) {
this.props = props;
delta = createDelta(props);
mask = createMask(props);
start = createStart(props);
charIndexes = createCharIndexes(props);
}

private int createDelta(final Properties props) {
if (props.containsKey(DELTA)) {
String delta = props.getProperty(DELTA);
try {
return Integer.parseInt(delta);
} catch (NumberFormatException ex) {
throw new EncryptAlgorithmInitializationException("CHAR_DIGEST_LIKE", "delta can only be a decimal number");
}
}
return DEFAULT_DELTA;
}

private int createMask(final Properties props) {
if (props.containsKey(MASK)) {
String mask = props.getProperty(MASK);
try {
return Integer.parseInt(mask);
} catch (NumberFormatException ex) {
throw new EncryptAlgorithmInitializationException("CHAR_DIGEST_LIKE", "mask can only be a decimal number");
}
}
return DEFAULT_MASK;
}

private int createStart(final Properties props) {
if (props.containsKey(START)) {
String start = props.getProperty(START);
try {
return Integer.parseInt(start);
} catch (NumberFormatException ex) {
throw new EncryptAlgorithmInitializationException("CHAR_DIGEST_LIKE", "start can only be a decimal number");
}
}
return DEFAULT_START;
}

private Map<Character, Integer> createCharIndexes(final Properties props) {
String dictContent = props.containsKey(DICT) && !Strings.isNullOrEmpty(props.getProperty(DICT)) ? props.getProperty(DICT) : initDefaultDict();
Map<Character, Integer> result = new HashMap<>(dictContent.length(), 1);
for (int index = 0; index < dictContent.length(); index++) {
result.put(dictContent.charAt(index), index);
}
return result;
}

@SneakyThrows
private String initDefaultDict() {
InputStream inputStream = CharDigestLikeEncryptAlgorithm.class.getClassLoader().getResourceAsStream("algorithm/like/common_chinese_character.dict");
LineProcessor<String> lineProcessor = new LineProcessor<String>() {

private final StringBuilder builder = new StringBuilder();

@Override
public boolean processLine(final String line) {
if (line.startsWith("#") || 0 == line.length()) {
return true;
} else {
builder.append(line);
return false;
}
}

@Override
public String getResult() {
return builder.toString();
}
};
return CharStreams.readLines(new InputStreamReader(inputStream, Charsets.UTF_8), lineProcessor);
}

@Override
public String encrypt(final Object plainValue, final EncryptContext encryptContext) {
return null == plainValue ? null : digest(String.valueOf(plainValue));
}

private String digest(final String plainValue) {
StringBuilder result = new StringBuilder(plainValue.length());
for (char each : plainValue.toCharArray()) {
char maskedChar = getMaskedChar(each);
if ('%' == maskedChar) {
result.append(each);
} else {
result.append(maskedChar);
}
}
return result.toString();
}

private char getMaskedChar(final char originalChar) {
if ('%' == originalChar) {
return originalChar;
}
if (originalChar <= MAX_NUMERIC_LETTER_CHAR) {
return (char) ((originalChar + delta) & mask);
}
if (charIndexes.containsKey(originalChar)) {
return (char) (((charIndexes.get(originalChar) + delta) & mask) + start);
}
return (char) (((originalChar + delta) & mask) + start);
}

@Override
public String getType() {
return "CHAR_DIGEST_LIKE";
}
}

4. The fuzzy algorithm development progress

4.1 The first edition

Mask: 0b11111111111001111101
The original character: 0b1000101110101111讯
After encryption: 0b1000101000101101設
Assuming we know the key and encryption algorithm, the original string after a backward pass is:1.0b1000101100101101 謭
2.0b1000101100101111 謯
3.0b1000101110101101 训
4.0b1000101110101111 讯
5.0b1000101010101101 読
6.0b1000101010101111 誯
7.0b1000101000101111 訯
8.0b1000101000101101 設

4.2 The second edition

4.3 The third edition

5. How to use fuzzy query

dataSources:
ds_0:
dataSourceClassName: com.zaxxer.hikari.HikariDataSource
driverClassName: com.mysql.jdbc.Driver
jdbcUrl: jdbc:mysql://127.0.0.1:3306/test?allowPublicKeyRetrieval=true
username: root
password: root

rules:
- !ENCRYPT
encryptors:
like_encryptor:
type: CHAR_DIGEST_LIKE
aes_encryptor:
type: AES
props:
aes-key-value: 123456abc
tables:
user:
columns:
name:
cipherColumn: name
encryptorName: aes_encryptor
assistedQueryColumn: name_ext
assistedQueryEncryptorName: aes_encryptor
likeQueryColumn: name_like
likeQueryEncryptorName: like_encryptor
phone:
cipherColumn: phone
encryptorName: aes_encryptor
likeQueryColumn: phone_like
likeQueryEncryptorName: like_encryptor
queryWithCipherColumn: true


props:
sql-show: true
Logic SQL: insert into user ( id, name, phone, sex) values ( 1, '熊高祥', '13012345678', '男')
Actual SQL: ds_0 ::: insert into user ( id, name, name_ext, name_like, phone, phone_like, sex) values (1, 'gyVPLyhIzDIZaWDwTl3n4g==', 'gyVPLyhIzDIZaWDwTl3n4g==', '佹堝偀', 'qEmE7xRzW0d7EotlOAt6ww==', '04101454589', '男')
Logic SQL: update user set name = '熊高祥123', sex = '男1' where sex ='男' and phone like '130%'
Actual SQL: ds_0 ::: update user set name = 'K22HjufsPPy4rrf4PD046A==', name_ext = 'K22HjufsPPy4rrf4PD046A==', name_like = '佹堝偀014', sex = '男1' where sex ='男' and phone_like like '041%'
Logic SQL: select * from user where (id = 1 or phone = '13012345678') and name like '熊%'
Actual SQL: ds_0 ::: select `user`.`id`, `user`.`name` AS `name`, `user`.`sex`, `user`.`phone` AS `phone`, `user`.`create_time` from user where (id = 1 or phone = 'qEmE7xRzW0d7EotlOAt6ww==') and name_like like '佹%'
Logic SQL: select * from user LEFT JOIN user_ext on user.id=user_ext.id where user.id in (select id from user where sex = '男' and name like '熊%')
Actual SQL: ds_0 ::: select `user`.`id`, `user`.`name` AS `name`, `user`.`sex`, `user`.`phone` AS `phone`, `user`.`create_time`, `user_ext`.`id`, `user_ext`.`address` from user LEFT JOIN user_ext on user.id=user_ext.id where user.id in (select id from user where sex = '男' and name_like like '佹%')
Logic SQL: delete from user where sex = '男' and name like '熊%'
Actual SQL: ds_0 ::: delete from user where sex = '男' and name_like like '佹%'

Links

Author

--

--

Everything connected with Tech & Code. Follow to join our 1M+ monthly readers

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Apache ShardingSphere

Transform any DBMS into a distributed database system & enhance it with sharding, elastic scaling features & more. https://linktr.ee/ApacheShardingSphere