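Most of the distributed crawler framework's features are nearly finished, so it is time to start thinking about hooking it up to Kafka.
Kafka depends on ZooKeeper, so ZooKeeper has to be installed as well.
Dockerfile-kafka is as follows: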

    FROM java:8u111-jdk
    MAINTAINER stcoder
    RUN apt-get update
    RUN apt-get -y install wget tar supervisor
    WORKDIR /tmp
    RUN wget http://mirrors.tuna.tsinghua.edu.cn/apache/kafka/1.0.0/kafka_2.11-1.0.0.tgz
    RUN mkdir /home/kafka
    RUN tar zxvf kafka_2.11-1.0.0.tgz -C /home
    ADD supervisord.conf /etc/supervisord.conf
    EXPOSE 2181 2888 3888 9092
    CMD ["/usr/bin/supervisord", "-c", "/etc/supervisord.conf"]

One thing worth mentioning here is supervisor. This container runs two services, ZooKeeper and Kafka, and supervisor is what makes that possible. Here is supervisord.conf:

    [supervisord]
    nodaemon=true

    [program:zookeeper]
    command=/home/kafka_2.11-1.0.0/bin/zookeeper-server-start.sh /home/kafka_2.11-1.0.0/config/zookeeper.properties

    [program:kafka]
    command=/home/kafka_2.11-1.0.0/bin/kafka-server-start.sh /home/kafka_2.11-1.0.0/config/server.properties

Next is docker-compose.yml:
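
    version: '2'
    services:
      kafka:
        restart: always
        build:
          context: .
          dockerfile: Dockerfile-kafka
        volumes:
          # Mount the custom configuration into the container
          - ./conf/server.properties:/home/kafka_2.11-1.0.0/config/server.properties:ro
          - ./conf/zookeeper.properties:/home/kafka_2.11-1.0.0/config/zookeeper.properties:ro

Here the configuration files in the conf folder next to docker-compose.yml are mounted into the container as read-only volumes, which is how the configuration gets customized.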
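Since the mounted files replace the defaults shipped in the Kafka tarball, it can help to look at what the broker will actually read before starting the container. Below is a minimal sketch of a simplified `key=value` parser for `conf/server.properties`. The helper names `load_properties` and `pick` are hypothetical, and the keys shown in the usage note (`zookeeper.connect`, `listeners`, `advertised.listeners`) are only my guess at the settings most relevant to this setup; the original post does not show the contents of its config files.

```python
def load_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments.

    This is a simplified reader, not a full Java .properties parser:
    it ignores line continuations and the ':' / whitespace separators.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(('#', '!')):
            continue
        key, sep, value = line.partition('=')
        if sep:
            props[key.strip()] = value.strip()
    return props


def pick(props, keys):
    """Return the requested keys, with None for any key that is not set."""
    return {key: props.get(key) for key in keys}
```

It could be used as `pick(load_properties(open('conf/server.properties').read()), ['zookeeper.connect', 'listeners', 'advertised.listeners'])`. A value of `None` means the key is commented out or missing, in which case the broker falls back to its default for that setting.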
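At this point a single-node Kafka container has been built, so the next step is to test it. We build a web service with Flask: when a user requests /test, it sends a message with the payload hello to the topic named mytopic. test-app.py is as follows: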

    from flask import Flask, request
    from confluent_kafka import Producer

    app = Flask(__name__)
    kafka_conf = {'bootstrap.servers': 'kafka:9092'}

    @app.route('/test')
    def test():
        p = Producer(kafka_conf)
        p.produce('mytopic', 'hello')
        p.flush()
        return 'ok'

    if __name__ == '__main__':
        app.run('0.0.0.0')

The Dockerfile-app that builds this web server is as follows:

    FROM ubuntu:16.04
    MAINTAINER stcoder
    RUN apt-get update
    RUN apt-get -y install python3 python3-dev python3-pip \
        wget software-properties-common python-software-properties
    RUN wget -qO - http://packages.confluent.io/deb/4.0/archive.key | apt-key add -
    RUN add-apt-repository "deb [arch=amd64] http://packages.confluent.io/deb/4.0 stable main"
    RUN apt-get update && apt-get -y install librdkafka-dev
    ADD requirements.txt /tmp/requirements.txt
    WORKDIR /tmp
    RUN pip3 install --upgrade pip
    RUN pip3 install -r requirements.txt
    WORKDIR /home
    RUN mkdir kafka-docker-test
    WORKDIR kafka-docker-test
    CMD python3 test-app.py

The docker-compose.yml that builds the whole test system is as follows:

    version: '2'
    services:
      kafka:
        restart: always
        build:
          context: .
          dockerfile: Dockerfile-kafka
        volumes:
          # Mount the custom configuration into the container
          - ./conf/server.properties:/home/kafka_2.11-1.0.0/config/server.properties:ro
          - ./conf/zookeeper.properties:/home/kafka_2.11-1.0.0/config/zookeeper.properties:ro
      app:
        restart: always
        build:
          context: .
          dockerfile: Dockerfile-app
        volumes:
          - ./:/home/kafka-docker-test:ro
        ports:
          - "5000:5000"
        links:
          - kafka

The directory layout of the whole test project is as follows:

    .
    ├── conf
    │   ├── server.properties
    │   └── zookeeper.properties
    ├── docker-compose.yml
    ├── Dockerfile-app
    ├── Dockerfile-kafka
    ├── requirements.txt
    ├── supervisord.conf
    └── test-app.py

After starting everything with docker-compose up, visiting 127.0.0.1:5000/test returns ok.
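Note that the ok response only shows that the request handler ran to completion. `produce()` is asynchronous, and without a delivery callback a failed send goes unreported. A minimal sketch of how the send could be confirmed, assuming a producer object with the confluent_kafka `Producer` interface (`produce` with a `callback` argument, and `flush` returning the number of messages still queued). `send_and_confirm` is a hypothetical helper name, not part of the original app.

```python
def send_and_confirm(producer, topic, payload, timeout=10.0):
    """Produce one message and report whether a delivery report came back clean.

    Returns (True, None) on success, or (False, reason) on failure or timeout.
    """
    result = {}

    def on_delivery(err, msg):
        # Called from flush()/poll() once the broker answers or the send fails.
        result['err'] = err

    producer.produce(topic, payload, callback=on_delivery)
    remaining = producer.flush(timeout)

    if remaining > 0 or 'err' not in result:
        return False, 'timed out waiting for delivery report'
    if result['err'] is not None:
        return False, str(result['err'])
    return True, None
```

In test-app.py this could be called as `send_and_confirm(Producer(kafka_conf), 'mytopic', 'hello')`, with the view returning an error status when the first element is `False`. Taking the producer as a parameter also makes the helper easy to exercise with a stand-in object when no broker is available.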
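Use docker exec -it <kafka container> /bin/bash to open a shell in the container,
go to the Kafka root directory,
and run bin/kafka-topics.sh --list --zookeeper 127.0.0.1:2181

    mytopic

The output shows that mytopic was created successfully. That concludes the test: the Kafka container is usable.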
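Listing the topic confirms that it exists, but not that the hello payload arrived. A minimal sketch of reading the message back, assuming a consumer object with the confluent_kafka `Consumer` interface (`subscribe`, `poll`, `close`). `read_first_value` is a hypothetical helper name, and the consumer settings in the usage note are assumptions rather than something taken from the original setup.

```python
import time


def read_first_value(consumer, topic, timeout=10.0):
    """Subscribe to `topic` and return the first payload as text, or None on timeout."""
    consumer.subscribe([topic])
    try:
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            msg = consumer.poll(1.0)
            if msg is None:
                continue  # nothing received within this poll interval
            if msg.error():
                continue  # skip error events and keep polling until the deadline
            value = msg.value()
            return value.decode('utf-8') if value is not None else None
        return None
    finally:
        # Always leave the consumer group cleanly, even on timeout.
        consumer.close()
```

Run from the app container, where the hostname kafka resolves, this might look like `read_first_value(Consumer({'bootstrap.servers': 'kafka:9092', 'group.id': 'smoke-test', 'auto.offset.reset': 'earliest'}), 'mytopic')`. The group id smoke-test is made up, and depending on the client version the offset reset setting may need to be placed under `default.topic.config` instead.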
Reposted from: http://iigmi.baihongyu.com/