
1. Overview: finance, transportation, healthcare, retail, industry, agriculture, justice, logistics
2. Details: to be continued
1. Read the raw data: html = urlopen(url).read()
2. Clean the data: raw = nltk.clean_html(html)
3. Slice the data: raw = raw[111:2222222]
4. Tokenize the data: tokens = nltk.wordpunct_tokenize(raw)
5. Slice the tokens: tokens = tokens[2
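The five steps above can be sketched end to end with the standard library alone. This is a minimal sketch under stated assumptions, not the post's own code: `nltk.clean_html` is no longer implemented in NLTK 3 (it raises an error that points to BeautifulSoup instead), so the hypothetical `strip_tags` helper stands in for it, and the regular expression `\w+|[^\w\s]+` stands in for `nltk.wordpunct_tokenize`, which is built on the same pattern. The sample HTML string and the slice bounds are made up for illustration; in practice `html` would come from decoding `urlopen(url).read()`.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags(html):
    """Hypothetical stand-in for the old nltk.clean_html (step 2)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the removed markup.
    return " ".join(" ".join(parser.parts).split())


def wordpunct_tokenize(text):
    """Split into runs of word characters and runs of punctuation (step 4)."""
    return re.findall(r"\w+|[^\w\s]+", text)


# Step 1 (assumed sample input instead of a network fetch).
html = ("<html><head><style>p{}</style></head>"
        "<body><p>Hello, world!</p><p>It's 2015.</p></body></html>")

raw = strip_tags(html)            # step 2: data cleaning
raw = raw[0:200]                  # step 3: slice the text (illustrative bounds)
tokens = wordpunct_tokenize(raw)  # step 4: tokenize
tokens = tokens[0:5]              # step 5: slice the tokens (illustrative bounds)
print(tokens)
```

Slicing the text before tokenizing, as the original steps do, is a quick way to cut off page headers and footers, but the right offsets depend on the page being processed.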
#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on 2015-1-7
@author: beyondzhou
@name: fetch_users_information.py
'''
# Fetch user information
def fetch_users_information():
    # import
1. Frequency analysis
from prettytable import PrettyTable
from collections import Counter
for label, data in (('Word', words),
                    ('Screen Name', screen_names),
                    ('Hashtag', hashtags)):
    pt = PrettyTable(field_names=
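The frequency-analysis loop above relies on `collections.Counter`. A minimal sketch using only the standard library follows. The `words`, `screen_names`, and `hashtags` lists are made-up sample data, the `top_items` helper is hypothetical, and plain string formatting stands in for the PrettyTable output because the original snippet is cut off before the table is filled.

```python
from collections import Counter

# Assumed sample data; in the original post these lists come from Weibo statuses.
words = ["weibo", "data", "weibo", "mining", "data", "weibo"]
screen_names = ["alice", "bob", "alice"]
hashtags = ["python", "nlp", "python", "python"]


def top_items(data, limit=10):
    """Return the `limit` most frequent items as (item, count) pairs."""
    return Counter(data).most_common(limit)


for label, data in (("Word", words),
                    ("Screen Name", screen_names),
                    ("Hashtag", hashtags)):
    # One small two-column table per entity type.
    print(f"{label:<12} Count")
    for item, count in top_items(data, limit=2):
        print(f"{item:<12} {count}")
    print()
```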
#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on 2015-1-4
@author: beyondzhou
@name: find_popular_entities.py
'''
# Extract entities of sina weibo
def find_popular_entities():
    # im
Sina Weibo Data Mining Cookbook, Part 6: Elements (Extracting Weibo Elements)
Sina Weibo Data Mining Cookbook, Part 2: Topics (selenium)
#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on 2015-1-10
@author: beyondzhou
@name: analyze_friends_followers.py
'''
# Analyze user's friends and followers
def analyze_friends_followers()

#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on 2015-1-9
@author: beyondzhou
@name: harvest_users_weibo.py
'''
# Harvest users weibo
def harvest_users_weibo():
    # import
    impor

#!/usr/bin/python
# -*- coding: utf-8 -*-
'''
Created on 2015-1-3
@author: beyondzhou
@name: find_popular_weibo.py
'''
# Find popular weibo
def find_popular_weibo():
    # import
    from log