Redis range queries in practice: finding the city for an IP address

矮蛋蛋

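Original article: http://www.tuicool.com/articles/BrURbqV
Requirement

Given an IP address, find the city it belongs to.

Previous solution

Oracle table (ip_country):

Looking up the city for an IP:

1. Convert the IP from a.b.c.d form into a number, for example 210.21.224.34 becomes 3524648994

2. select city from ip_country where ipstartdigital <= 3524648994 and 3524648994 <= ipenddigital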
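
The conversion in step 1 can be sketched as follows. This is a minimal illustration, not code from the original post; the class name IpToNumber and the method toNumber are hypothetical. It packs the four octets into one unsigned 32-bit value, which is what reproduces the 3524648994 used above.

```java
// Hypothetical helper (not from the original post): dotted IPv4 text to its numeric form.
class IpToNumber {

    /** Packs the four octets of an IPv4 address into an unsigned 32-bit value held in a long. */
    static long toNumber(String ip) {
        String[] octets = ip.trim().split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Expected a.b.c.d but got: " + ip);
        }
        long result = 0;
        for (String octet : octets) {
            int value = Integer.parseInt(octet);
            if (value < 0 || value > 255) {
                throw new IllegalArgumentException("Octet out of range in: " + ip);
            }
            // Shift the accumulated value one byte to the left and append the next octet.
            result = (result << 8) | value;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(toNumber("210.21.224.34")); // 3524648994
    }
}
```

A long is used because the result can exceed the range of a signed 32-bit int.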

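Redis solution

Let's first simplify the table above:

id  city  min  max
1   P1    0    100
2   P2    101  200
3   P3    201  300
4   P4    301  400
(Note: the ranges formed by min/max must not overlap.)

The main idea is to store each row of the table with hmset, giving every row an id that becomes part of its key.

Then put every ip_end into a sorted set, with ip_end as the score and the row's id as the member, so that the set is ordered by ip_end from smallest to largest.

To query, use the sorted set's range query by score to get an id, then hmget the row for that id.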
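
To make the lookup idea concrete without a Redis server, here is a small in-memory sketch. It is an analogy, not the author's code: a java.util.TreeMap keyed by each range's max stands in for the sorted set (score = max, member = id), and ceilingEntry plays the role of "first member with score >= value". The class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory analogy of the sorted-set lookup; TreeMap stands in for the Redis sorted set.
class RangeLookupSketch {

    // Key: upper bound (max) of a range, value: row id. Mirrors "score = max, member = id".
    private final TreeMap<Long, Integer> idByMax = new TreeMap<>();

    void addRange(int id, long max) {
        idByMax.put(max, id);
    }

    /** Returns the id of the first range whose max is >= value, or null if there is none. */
    Integer findId(long value) {
        Map.Entry<Long, Integer> entry = idByMax.ceilingEntry(value);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        RangeLookupSketch lookup = new RangeLookupSketch();
        lookup.addRange(1, 100);
        lookup.addRange(2, 200);
        lookup.addRange(3, 300);
        lookup.addRange(4, 400);
        System.out.println(lookup.findId(90));  // 1
        System.out.println(lookup.findId(123)); // 2
        System.out.println(lookup.findId(401)); // null
    }
}
```

Because only the upper bounds are indexed, the lookup relies on the ranges not overlapping, as noted for the simplified table.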

Experiment

//store each row of the table
127.0.0.1:6379> hmset {ip}:1 city P1 min 0 max 100
OK
127.0.0.1:6379> hmset {ip}:2 city P2 min 101 max 200
OK
127.0.0.1:6379> hmset {ip}:3 city P3 min 201 max 300
OK
127.0.0.1:6379> hmset {ip}:4 city P4 min 301 max 400
OK

//build the sorted set (arguments are score member pairs: score = max, member = id)
127.0.0.1:6379> zadd {ip}:end.asc 100 1 200 2 300 3 400 4
(integer) 4
127.0.0.1:6379> zrange {ip}:end.asc 0 -1
1) "1"
2) "2"
3) "3"
4) "4"

//query the matching range (by score)
127.0.0.1:6379> zrangebyscore {ip}:end.asc 90 +inf LIMIT 0 1
1) "1"
127.0.0.1:6379> zrangebyscore {ip}:end.asc 123 +inf LIMIT 0 1
1) "2"
127.0.0.1:6379> zrangebyscore {ip}:end.asc 100 +inf LIMIT 0 1
1) "1"
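//Explanation:
//zrangebyscore {ip}:end.asc 90 +inf LIMIT 0 1
//finds the first member whose score is greater than or equal to 90 (+inf means positive infinity in Redis).
//The member returned is 1, which matches the id used in hmset, so the city can now be looked up with hmget:

//look up the city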
127.0.0.1:6379> hmget {ip}:1 city
1) "P1"
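Note that the Redis keys were designed with a common prefix: {ip}

This makes all the IP-related data land on the same Redis server (our Redis is deployed as a cluster and uses consistent hashing), which makes later work such as data migration easier.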
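
The post does not say which rule the author's client uses to pick the part of the key that gets hashed. As an illustration only, the sketch below follows the hash tag rule documented for Redis Cluster: if the key contains a non-empty {...} section, only the text between the first { and the next } is hashed. The class and method names are hypothetical, and whether the author's consistent-hashing setup behaves the same way is an assumption.

```java
// Sketch of the Redis Cluster hash tag rule; assumed, not confirmed, to match the author's setup.
class HashTagSketch {

    /** Returns the part of the key that would be hashed under the hash tag rule. */
    static String hashedPart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            // The tag only counts if there is at least one character between the braces.
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key; // no usable tag: the whole key is hashed
    }

    public static void main(String[] args) {
        System.out.println(hashedPart("{ip}:1"));       // ip
        System.out.println(hashedPart("{ip}:end.asc")); // ip
        System.out.println(hashedPart("plainkey"));     // plainkey
    }
}
```

Since {ip}:1, {ip}:2 and {ip}:end.asc all reduce to the same hashed part, they would be placed together under this rule.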

In practice

The text exported from the database looks like this (a few rows picked as examples):

ipcountry_tab_orderby_end_asc.txt:

"IPSTART" "IPSTARTDIGITAL" "IPEND" "IPENDDIGITAL" "COUNTRY" "CITY" "TYPE" "REGISTRY" "ADRESS" "PROVINCE"
"1.184.0.0" 28835840 "1.184.127.255" 28868607 "中国" "广州市" "" "" "" "广东省"
"1.184.128.0" 28868608 "1.184.255.255" 28901375 "中国" "广州市" "" "" "" "广东省"
"1.185.0.0" 28901376 "1.185.95.255" 28925951 "中国" "南宁市" "" "" "" "广西省"
1. Generate the bulk hmset and zadd commands

A small program generates them:

import java.io.File;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.io.FileUtils;
import org.apache.commons.lang3.StringUtils;

public class IpCountryRedisImport {

public static void main(String[] args) throws IOException {
    File file = new File("E:/doc/ipcountry_tab_orderby_end_asc.txt");
    File hmsetFile = new File("E:/doc/ip_country_redis_cmd.txt");
    File zaddFile = new File("E:/doc/ip_country_redis_zadd.txt");

    List<String> lines = FileUtils.readLines(file);
    int i = 0;
    StringBuilder rows = new StringBuilder();
    StringBuilder ends = new StringBuilder();
    for (String str : lines) {
      if (StringUtils.isEmpty(str)) {
        continue;
      }

      //skip first line
      if (i == 0) {
        i++;
        continue;
      }

      i++;

      //"IPSTART" "IPSTARTDIGITAL" "IPEND" "IPENDDIGITAL" "COUNTRY" "CITY" "TYPE" "REGISTRY" "ADRESS" "PROVINCE"
      //0               1                2         3              4          5       6         7         8            9
      String[] parts = str.split("\t");
      String start = parts[1];
      String end = parts[3];
      String country = parts[4];
      String city = parts[5];
      String type = parts[6];
      String registry = parts[7];
      String address = parts[8];
      String province = parts[9];

      //String cmd = "hmset {ip}:" + (i++) + " start " + start + " end " + end + " country " + country + " city " + city + " type " + type + " registry " + registry + " address " + address + " province " + province;

      rows.append("*18\r\n");

      rows.append(format("hmset"));

      rows.append(format("{ip}:" + i));

      rows.append(format("start"));
      rows.append(format(start));

      rows.append(format("end"));
      rows.append(format(end));

      rows.append(format("country"));
      rows.append(format(country));

      rows.append(format("city"));
      rows.append(format(city));

      rows.append(format("type"));
      rows.append(format(type));

      rows.append(format("registry"));
      rows.append(format(registry));

      rows.append(format("address"));
      rows.append(format(address));

      rows.append(format("province"));
      rows.append(format(province));


      //zadd {ip}:end.asc 1234 1
      ends.append("*4\r\n");
      ends.append(format("zadd"));
      ends.append(format("{ip}:end.asc"));
      ends.append(format(end));
      ends.append(format("" + i));

    }
    FileUtils.writeStringToFile(hmsetFile, rows.toString(), "UTF-8");
    FileUtils.writeStringToFile(zaddFile, ends.toString(), "UTF-8");
    System.out.println(1);
}

private static String format(String value) throws UnsupportedEncodingException {
    String trimValue = value.replace("\"", "");
    return "$" + trimValue.getBytes("UTF-8").length+ "\r\n" + trimValue + "\r\n";
}

}
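Note that in the format method the length written after $ is not the number of characters in the string but its length in bytes after encoding (UTF-8 here).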
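
The difference only shows up for non-ASCII values such as the city names. A minimal sketch, with a hypothetical class name, comparing the two lengths for the city value from the sample data (written with Unicode escapes so the result does not depend on the source file encoding):

```java
import java.nio.charset.StandardCharsets;

// Illustrates character count versus UTF-8 byte count for one of the sample city names.
class BulkLengthDemo {

    /** Length that belongs after '$' in the protocol: the number of UTF-8 bytes. */
    static int byteLength(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String city = "\u5e7f\u5dde\u5e02"; // the three-character city name from the sample rows
        System.out.println(city.length());    // 3 characters
        System.out.println(byteLength(city)); // 9 bytes
    }
}
```

Writing the character count instead would make the declared length disagree with the bytes that follow it.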

Example of the generated hmset output (ip_country_redis_cmd.txt; every line ends with \r\n):

*18
$5
hmset
$8
{ip}:645
$5
start
$8
28835840
$3
end
$8
28868607
$7
country
$6
中国
$4
city
$9
广州市
$4
type
$0

$8
registry
$0

$7
address
$0

$8
province
$9
广东省
Example of a generated zadd command (ip_country_redis_zadd.txt):

*4
$4
zadd
$12
{ip}:end.asc
$8
16777471
$1
2
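Note that after the txt files are uploaded to Linux through SecureCRT, the \r\n endings may have been reduced to \n. They can be restored as follows (the \r? keeps lines that still end in \r\n from receiving a second \r):

perl -pi -e 's/\r?\n/\r\n/' ip_country_redis_cmd.txt
perl -pi -e 's/\r?\n/\r\n/' ip_country_redis_zadd.txt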
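2. Import into Redis

Once the files are generated, import them with:

cat ip_country_redis_cmd.txt | redis-cli --pipe
cat ip_country_redis_zadd.txt | redis-cli --pipe
Importing 400,000 rows took less than a minute; Redis mass insertion is quite powerful.

One thing worth mentioning here: the Redis documentation on mass insertion can be misleading.

The documentation shows this: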

SET Key0 Value0
SET Key1 Value1
...
SET KeyN ValueN
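At first I assumed that, as shown above, it was enough to write the Redis commands into one text file and import it directly:

cat cmd.txt | redis-cli --pipe

That is not the case: the input has to follow the Redis protocol.

Protocol syntax:

*<number of arguments><cr><lf>
$<byte length of argument 1><cr><lf>
<argument 1><cr><lf>
...
$<byte length of argument N><cr><lf>
<argument N><cr><lf>
Example: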

*3<cr><lf>
$3<cr><lf>
SET<cr><lf>
$3<cr><lf>
key<cr><lf>
$5<cr><lf>
value<cr><lf>
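Explanation:

The number after * is the number of arguments in the Redis command, counting the command name itself.

For example:

set ab 1234 has 3 arguments

hmset name google.com 1 baidu.com 2 has 6 arguments

Next comes each part of the command (the parts separated by spaces), first its length, then its value.

Taking "set ab 1234" as the example:

set has length 3, ab has length 2 and 1234 has length 4, so the final content is: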

*3
$3
set
$2
ab
$4
1234
Note that every line ends with <cr><lf> (that is, \r\n).
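
The encoding rules above can be sketched as one small function. This is an illustration with hypothetical names (RespEncoder, encode), not part of the import program: it writes the argument count after *, then for each argument its UTF-8 byte length after $ followed by the value, with \r\n after every line.

```java
import java.nio.charset.StandardCharsets;

// Minimal sketch of encoding one command in the Redis protocol format described above.
class RespEncoder {

    /** Encodes a command (name plus arguments) as an array of bulk strings. */
    static String encode(String... args) {
        StringBuilder sb = new StringBuilder();
        sb.append('*').append(args.length).append("\r\n");
        for (String arg : args) {
            // The length is counted in bytes, not characters.
            int byteLength = arg.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(byteLength).append("\r\n");
            sb.append(arg).append("\r\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Show the line endings visibly instead of printing raw \r\n.
        System.out.println(encode("set", "ab", "1234").replace("\r\n", "\\r\\n"));
    }
}
```

For "set ab 1234" this produces *3, $3, set, $2, ab, $4, 1234, each terminated by \r\n, matching the example above.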

3. Query

Using Spring Data Redis

Key code:

long min = ip; // the IP converted to a number
long max = Long.MAX_VALUE;
long offset = 0;
long count = 1;
// First member whose score (ip_end) is >= ip; the member is the row id.
Set<String> result = redisTemplate.opsForZSet().rangeByScore(zSetName, min, max, offset, count);
if (result == null || result.isEmpty()) {
    return null; // the IP is above every stored range
}
String id = result.iterator().next();

final String ipKey = redisIprowPrefix + id;

String city = redisTemplate.execute(new RedisCallback<String>() {

    @Override
    public String doInRedis(RedisConnection connection) throws DataAccessException {
        byte[] key = redisTemplate.getStringSerializer().serialize(ipKey);
        if (connection.exists(key)) {
            // Fetch only the "city" field of the row's hash.
            List<byte[]> value = connection.hMGet(key,
                    redisTemplate.getStringSerializer().serialize("city"));
            return redisTemplate.getStringSerializer().deserialize(value.get(0));
        }
        return null;
    }

});
redisTemplate needs its serializer properties configured:

<bean id="redisTemplate" class="org.springframework.data.redis.core.RedisTemplate"
    p:connection-factory-ref="jedisConnFactory">
    <property name="valueSerializer">
      <bean class="org.springframework.data.redis.serializer.StringRedisSerializer" />
    </property>
    <property name="keySerializer">
      <bean class="org.springframework.data.redis.serializer.StringRedisSerializer" />
    </property>
</bean>
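
One thing the query above does not check: the sorted set only indexes the end of each range, so if the data has gaps between ranges, an IP that falls in a gap still matches the next range. The original post does not include such a check; the sketch below is an assumption about how it could be done. It takes the row's hash fields as a plain Map (field names start, end and city as written by the import program) and returns the city only when the IP really lies inside the range. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical post-check, not from the original post: confirm the candidate row covers the IP.
class RangeHitCheck {

    /** Returns the row's city if start <= ip <= end, otherwise null. */
    static String cityIfCovered(long ip, Map<String, String> row) {
        if (row == null || row.get("start") == null || row.get("end") == null) {
            return null;
        }
        long start = Long.parseLong(row.get("start"));
        long end = Long.parseLong(row.get("end"));
        return (start <= ip && ip <= end) ? row.get("city") : null;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("start", "28835840");
        row.put("end", "28868607");
        row.put("city", "Guangzhou"); // placeholder value for the demo
        System.out.println(cityIfCovered(28840000L, row)); // Guangzhou
        System.out.println(cityIfCovered(28835839L, row)); // null
    }
}
```

Using it would mean fetching start and end along with city in the hMGet call.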
