Wednesday, July 31, 2013

Daily Bookmarks 20130731

Longest common substring algorithm for two or N strings - tianmo2010's column - Blog Channel - CSDN.NET
http://blog.csdn.net/tianmo2010/article/details/7473717

Sunday, July 28, 2013

Daily Bookmarks 20130728

Mozilla Firefox Start Page
http://www.renren.com/268217599
http://www.g.cn/
http://www.g.cn/
Facebook
https://www.facebook.com/
pymmesg - Google Search
https://www.google.com.tw/search?q=pymmesg&ie=utf-8&oe=utf-8&rls=org.mozilla:zh-TW:official&client=firefox-a&gws_rd=cr
pymmseg - Google Search
https://www.google.com.tw/search?client=firefox-a&hs=dz8&rls=org.mozilla:zh-TW:official&q=pymmseg&spell=1&sa=X&ei=NPfzUYHCOY6bkgWYroCgBg&ved=0CC4QvwUoAA&biw=1275&bih=725
pluskid/pymmseg-cpp
https://github.com/pluskid/pymmseg-cpp
pymmseg-cpp - Google Search
https://www.google.com.tw/search?q=pymmseg-cpp&ie=utf-8&oe=utf-8&rls=org.mozilla:zh-TW:official&client=firefox-a&channel=rcs&gws_rd=cr
Python Chinese word segmentation: pure-Python implementation / FMM algorithm / pymmseg-cpp / smallseg / judou / BECer-GAE
http://www.starming.com/index.php?action=plugin&v=wave&tpl=union&ac=viewgrouppost&gid=73&tid=13336
Python Chinese word segmentation: installing pymmseg - zhangxinrun's column - Blog Channel - CSDN.NET
http://blog.csdn.net/zhangxinrun/article/details/7525740
youngking/pymmseg
https://github.com/youngking/pymmseg
pymmseg-cpp - High performance Chinese word segmenting module for Python - Google Project Hosting
http://code.google.com/p/pymmseg-cpp/
Improving Pymmseg's word segmentation - frEefiS ' tHiNkinG
http://freefis.appspot.com/?p=111001
pymmseg-cpp/demos/use_custom_dict.py at master · shuge/pymmseg-cpp
https://github.com/shuge/pymmseg-cpp/blob/master/demos/use_custom_dict.py
Python Chinese word segmentation: installing pymmseg - python - ITeye tech site
http://ipython.iteye.com/blog/1136931
Chinese word segmentation with pymmseg - 地瓜日记 - cnblogs
http://www.cnblogs.com/sweetpotato-diary/archive/2012/03/20/2408941.html
Two word segmentation tools for Python | 旁门左道
http://log.medcl.net/item/2011/03/python%E4%B8%8B%E7%9A%84%E5%88%86%E8%AF%8D%E5%BA%93/
longest common subsequence spam detect - Google Search
https://www.google.com.tw/search?q=longest+common+subsequence+spam+detect&client=firefox-a&hs=UC9&rls=org.mozilla:zh-TW:official&ei=UfrzUYHVMciGkgXT8IGoDg&start=10&sa=N&biw=1275&bih=725
pymmseg-cpp pip - Google Search
https://www.google.com.tw/search?client=firefox-a&hs=ico&rls=org.mozilla%3Azh-TW%3Aofficial&q=pymmseg-cpp+pip&oq=pymmseg-cpp+pip&gs_l=serp.3...1709.3855.0.4117.4.4.0.0.0.0.93.351.4.4.0....0...1c.1.22.serp..3.1.93.iL5Fw559ofk
http://autodaguo-python.googlecode.com/svn/trunk/mybot.txt
http://autodaguo-python.googlecode.com/svn/trunk/mybot.txt
python list modules - Google Search
https://www.google.com.tw/search?q=python+list+modules&ie=utf-8&oe=utf-8&rls=org.mozilla:zh-TW:official&client=firefox-a&gws_rd=cr
Get a list of installed Python modules - Stack Overflow
http://stackoverflow.com/questions/739993/get-a-list-of-installed-python-modules
新酷音 dict - Google Search
https://www.google.com.tw/search?q=%E6%96%B0%E9%85%B7%E9%9F%B3+dict&ie=utf-8&oe=utf-8&rls=org.mozilla:zh-TW:official&client=firefox-a&gws_rd=cr
Re: [Chat] Can New Chewing (新酷音) do without its built-in vocabulary? - IME board - PTT (批踢踢實業坊)
http://www.ptt.cc/bbs/IME/M.1241690936.A.43A.html
http://svn.openfoundry.org/libchewingdata/readme.html
http://svn.openfoundry.org/libchewingdata/readme.html
New Chewing (新酷音) shared phrase dictionary
http://hyperrate.com/thread.php?tid=21020
pymmseg-cpp 繁體 - Google Search
https://www.google.com.tw/search?q=pymmseg-cpp+%E7%B9%81%E9%AB%94&client=firefox-a&hs=Kh9&rls=org.mozilla:zh-TW:official&ei=yQH0UfKDDMKrkAWaoIDoCw&start=10&sa=N&biw=1275&bih=725
Chinese word segmentation in practice and preliminary ideas for segmenting Classical Chinese | 京華煙云
http://www.yenching.org/2009/10/%e4%b8%ad%e6%96%87%e5%88%86%e8%af%8d%e5%ae%9e%e6%88%98%e4%b8%8e%e6%96%87%e8%a8%80%e6%96%87%e5%88%86%e8%af%8d%e7%9a%84%e5%88%9d%e6%ad%a5%e8%ae%be%e6%83%b3/
Free Mind » Blog Archive » RMMSeg: Chinese word segmentation implemented in Ruby
http://lifegoo.pluskid.org/?p=261
新酷音 字典 - Google Search
https://www.google.com.tw/search?q=%E6%96%B0%E9%85%B7%E9%9F%B3+%E5%AD%97%E5%85%B8&client=firefox-a&hs=g8o&rls=org.mozilla:zh-TW:official&ei=ZgP0Ub6UEIWokQXIqYHgBQ&start=10&sa=N&biw=1275&bih=700
TWed2k - Tips and Tutorials - [Discovery] Tutorial on editing New Chewing (新酷音) Zhuyin readings
http://058176049149.ctinets.com/viewthread.php?action=printable&tid=290870
Tutorial on editing the New Chewing (新酷音) phrase dictionary and Zhuyin readings
http://chewing.csie.net/chewing_dict_edit.html
libchewing-data/utf-8/tsi.src at master · chewing/libchewing-data
https://github.com/chewing/libchewing-data/blob/master/utf-8/tsi.src
grep 非 打頭 - Google Search
https://www.google.com.tw/search?q=grep+%E9%9D%9E+%E6%89%93%E9%A0%AD&ie=utf-8&oe=utf-8&rls=org.mozilla:zh-TW:official&client=firefox-a&gws_rd=cr
The Way of Regular Expressions - just do it - China Economic Net (中國經濟網) economy blog
http://big5.ce.cn/gate/big5/blog.ce.cn/html/33/100933-55717.html
高鐵 - Yahoo! Kimo News search results
http://tw.news.search.yahoo.com/search;_ylt=A8tUwYGHB_RRyk8AoElr1gt.?p=%E9%AB%98%E9%90%B5&fr=ush-globalnews&fr2=piv-web
Taipei-Kaohsiung at NT$1,630: high-speed rail fares may rise as early as October - Yahoo! Kimo News
http://tw.news.yahoo.com/%E5%8C%97%E9%AB%981-630%E5%85%83-%E9%AB%98%E9%90%B5%E6%9C%80%E5%BF%AB10%E6%9C%88%E8%AA%BF%E6%BC%B2-213000245.html
新詞發現 最常共同子串 - Google Search
https://www.google.com.tw/search?q=%E6%96%B0%E8%A9%9E%E7%99%BC%E7%8F%BE+%E6%9C%80%E5%B8%B8%E5%85%B1%E5%90%8C%E5%AD%90%E4%B8%B2&ie=utf-8&oe=utf-8&rls=org.mozilla:zh-TW:official&client=firefox-a&gws_rd=cr
A new-word discovery algorithm based on large-scale corpora
http://www.programmer.com.cn/12276/
LCS 新詞發現 - Google Search
https://www.google.com.tw/search?q=LCS+%E6%96%B0%E8%A9%9E%E7%99%BC%E7%8F%BE&client=firefox-a&hs=LWp&rls=org.mozilla:zh-TW:official&ei=IAn0UayAHYnQkgWSkYEw&start=10&sa=N&biw=1275&bih=700
A new-word discovery algorithm based on large-scale corpora - - Blog Channel - CSDN.NET
http://blog.csdn.net/qyee16/article/details/7741975
A lexical acquisition method based on selectional preference_Baidu Wenku
http://wenku.baidu.com/view/3d091d65783e0912a2162a24.html
Longest common subsequence 大規模 - Google Search
https://www.google.com.tw/search?client=firefox-a&hs=JC&rls=org.mozilla%3Azh-TW%3Aofficial&channel=rcs&q=Longest+common+subsequence+%E5%A4%A7%E8%A6%8F%E6%A8%A1&oq=Longest+common+subsequence+%E5%A4%A7%E8%A6%8F%E6%A8%A1&gs_l=serp.3...1801.7287.0.7595.30.21.5.0.0.1.162.2057.15j6.21.0....0...1c.1.22.serp..25.5.210.1E4DjYTcN7o
http://sewm.pku.edu.cn/TianwangLiterature/Report/NCIS_TR_2007012.pdf
http://sewm.pku.edu.cn/TianwangLiterature/Report/NCIS_TR_2007012.pdf
抽取 公共子串 - Google Search
https://www.google.com.tw/search?q=%E6%8A%BD%E5%8F%96+%E5%85%AC%E5%85%B1%E5%AD%90%E4%B8%B2&client=firefox-a&rls=org.mozilla:zh-TW:official&ei=Jwz0UbnUHpCmkgWB74GYDQ&start=60&sa=N&biw=1275&bih=700
Finding the longest common substring of multiple strings --- suffix arrays - gdp5211314's column - Blog Channel - CSDN.NET
http://blog.csdn.net/gdp5211314/article/details/8362678
From diff to LCS (longest common subsequence): the beauty of abstraction-python-电脑编程网
http://biancheng.dnbcw.info/python/170358.html
[coreseek/sphinx study notes 4]--Search - iLovePHP - OSChina
http://my.oschina.net/wzwitblog/blog/109997
Similar-data detection algorithms
http://www.douban.com/note/180296814/
Karp-Rabin - Google Search
https://www.google.com.tw/search?q=Karp-Rabin&lr=lang_zh-CN%7Clang_zh-TW&client=firefox-a&hs=d1U&rls=org.mozilla:zh-TW:official&channel=rcs&tbs=lr:lang_1zh-CN%7Clang_1zh-TW&ei=sQv0UZbsN8flkAWXqIEo&start=10&sa=N&biw=1275&bih=700
Karp-Rabin algorithm
http://www-igm.univ-mlv.fr/~lecroq/string/node5.html
sequential extraction of common substrings - Google Search
https://www.google.com.tw/search?q=sequential+extraction+of+common+substrings&ie=utf-8&oe=utf-8&rls=org.mozilla:zh-TW:official&client=firefox-a&channel=rcs&gws_rd=cr
Statistics-based, dictionary-free extraction of high-frequency words (2): computing word frequency from the LCP array - 三度空间 - cnblogs
http://www.cnblogs.com/three-zone/p/LCP.html
Statistics-based, dictionary-free extraction of high-frequency words (1): lexicographic sorting of the suffix array - 脚本百事通
http://www.csdn123.com/html/blogs/20130614/22454.htm
抽取 共子串 - Google Search
https://www.google.com.tw/search?q=%E6%8A%BD%E5%8F%96+%E5%85%B1%E5%AD%90%E4%B8%B2&client=firefox-a&rls=org.mozilla:zh-TW:official&ei=eA_0Uf-JO8iXkwWW1ICABw&start=10&sa=N&biw=1275&bih=700
http://ir.dlut.edu.cn/ThesisList%5C2009%5C韩冰-大规模文本去重策略研究.pdf
http://ir.dlut.edu.cn/ThesisList%5C2009%5C%E9%9F%A9%E5%86%B0-%E5%A4%A7%E8%A7%84%E6%A8%A1%E6%96%87%E6%9C%AC%E5%8E%BB%E9%87%8D%E7%AD%96%E7%95%A5%E7%A0%94%E7%A9%B6.pdf

Wednesday, July 24, 2013

Daily Bookmarks 20130724

程式語言教學誌: Java quick tour - object-oriented concepts: generics
http://pydoing.blogspot.tw/2010/12/java-generic.html
Generics
http://docs.oracle.com/javase/1.5.0/docs/guide/language/generics.html
Java Generics ? , E and T what is the difference? - Stack Overflow
http://stackoverflow.com/questions/6008241/java-generics-e-and-t-what-is-the-difference
Interfaces (The Java™ Tutorials > Learning the Java Language > Interfaces and Inheritance)
http://docs.oracle.com/javase/tutorial/java/IandI/createinterface.html


Which film showed you your ideal of love? - Zhihu
http://www.zhihu.com/question/20448308
Using AWS CloudSearch: a minimal Java API
http://www.infoq.com/cn/articles/AmazonCloudSearch
HDFS namenode source code analysis | r6
http://www.r66r.net/?p=1093
Hurry! LINE Brown, Cony, and Moon stickers free to download now @ :: ifans :: :: PIXNET ::
http://ifans.pixnet.net/blog/post/153209316
Smart people are racking their brains to get people to click ads; what are the smarter ones doing? Collecting and analyzing that data | PingWest
http://pingwest.com/demo/adstage/

The Best of Tchaikovsky - YouTube
http://www.youtube.com/watch?v=7_WWz2DSnT8&list=PLcGkkXtask_fpbK9YXSzlJC4f0nGms1mI

The Best of Classical Music - YouTube
http://www.youtube.com/playlist?list=PLcGkkXtask_fpbK9YXSzlJC4f0nGms1mI
Lessons In Coding: The K&R Index of Blog Entries
http://lessonsincoding.blogspot.tw/p/c-programming-language-k.html
A Guide to Python Frameworks for Hadoop | Apache Hadoop for the Enterprise | Cloudera
http://blog.cloudera.com/blog/2013/01/a-guide-to-python-frameworks-for-hadoop/
Rethrick Construction
http://rethrick.com/#projects
Sample code: full-text search with ElasticSearch from a Java client - - ITeye tech site
http://shuminghuang.iteye.com/blog/1732129
Getting started with ElasticSearch « Jai’s Weblog – Tech, Security & Fun…
http://jaibeermalik.wordpress.com/2013/03/15/getting-started-with-elasticsearch/
Elasticsearch source code analysis, part 1: dependency injection and the module system with Guice_人人IT網
http://rritw.com/a/bianchengyuyan/C__/20120920/226667.html
search - Beginner's guide to ElasticSearch - Stack Overflow
http://stackoverflow.com/questions/11593035/beginners-guide-to-elasticsearch
An analysis of the Tencent Analytics system architecture -- System Operations -- IT技术博客大学习 -- Learn together, improve together!
http://blogread.cn/it/article/6440?f=wb

pros cons - Google Search
https://www.google.com.tw/search?q=pros+cons&oq=pros+cons&aqs=chrome.0.69i57j69i65l2j69i61j69i59l2.2564j0&sourceid=chrome&ie=UTF-8
English of the day! -- Pros & Cons - [V!cT0R] - Wretch (無名小站)
http://www.wretch.cc/blog/vicchen19/5909403
Generics and Collection - Java Steps
http://javasteps.plweb.org/java_generic.html
java t generic - Google Search
https://www.google.com.tw/search?q=java+t+generic&oq=java+T+ge&aqs=chrome.2.69i57j0l3j69i60l2.8155j0&sourceid=chrome&ie=UTF-8
Oracle Site Search - Secure Enterprise Search - Generics
http://search.oracle.com/search/search?start=1&search_p_main_operator=all&q=Generics
Introduction (The Java™ Tutorials > Bonus > Generics)
http://docs.oracle.com/javase/tutorial/extra/generics/intro.html
Defining Simple Generics (The Java™ Tutorials > Bonus > Generics)
http://docs.oracle.com/javase/tutorial/extra/generics/simple.html
Inheritance (The Java™ Tutorials > Learning the Java Language > Interfaces and Inheritance)
http://docs.oracle.com/javase/tutorial/java/IandI/subclasses.html
Creating Objects (The Java™ Tutorials > Learning the Java Language > Classes and Objects)
http://docs.oracle.com/javase/tutorial/java/javaOO/objectcreation.html
The Fine Print (The Java™ Tutorials > Bonus > Generics)
http://docs.oracle.com/javase/tutorial/extra/generics/fineprint.html

Friday, July 19, 2013

Daily Bookmarks 20130719


hadoop - Pass Python scripts for mapreduce to HBase - Stack Overflow
http://stackoverflow.com/questions/14241729/pass-python-scripts-for-mapreduce-to-hbase
kennethreitz/requests
https://github.com/kennethreitz/requests/

Bulk importing Data into HBase | Deerwalk Blog - Result.Reflect.Repeat.
http://www.deerwalk.com/blog/bulk-importing-data/
2. Using Pig to Bulk Load Data Into HBase - Hortonworks Data Platform
http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.1/bk_user-guide/content/user-guide-hbase-import-2.html




Fired at 29? Or staying on? - Cheers magazine (Cheers快樂工作人雜誌)
http://www.cheers.com.tw/article/article.action?id=5029348

Wednesday, July 17, 2013

Daily Bookmarks 20130716

How Google Dremel works: analyzing 1 PB in 3 seconds | 我自然
http://www.yankay.com/google-dremel-rationale/
Classic paper translation and reading guide: "Dremel: Interactive Analysis of WebScale Datasets" - ImportNew
http://www.importnew.com/2617.html

数据科学与R语言: Highly recommended: 《机器学习之黑客帝国》
http://xccds1977.blogspot.tw/2012/03/blog-post.html
数据科学与R语言: An R function for movie lovers
http://xccds1977.blogspot.tw/2013/06/r.html

数据科学与R语言: How do Twitter's data scientists work?
http://xccds1977.blogspot.tw/2012/03/twitter.html

PyCodersCN/issue12/machine-learning-for-hackers.rst at master · PyCodersCN/PyCodersCN
https://github.com/PyCodersCN/PyCodersCN/blob/master/issue12/machine-learning-for-hackers.rst
Unsupervised Learning — Clustering Analysis | 演衡學習筆記
http://c3h3notes.wordpress.com/2010/10/29/unsupervised-learning-clustering-analysis/



Friday, July 12, 2013

Daily Bookmarks 20130712

Inversion of Control Containers and the Dependency Injection pattern
http://martinfowler.com/articles/injection.html
parallel external merge sort - 碎碎唸
http://blog.yunglinho.com/blog/2013/03/19/parallel-external-merge-sort/
Dependency Injection in Scala - 碎碎唸
http://blog.yunglinho.com/blog/2012/04/22/dependency-injection-in-scala/
Learning the Spring IoC container and the Dependency Injection pattern the easy way - JAVA涂鸦 - BlogJava
http://www.blogjava.net/rickhunter/articles/29015.html
Spring study notes
http://openhome.cc/Gossip/SpringGossip/

python class - Google Search
https://www.google.com.tw/search?q=python+class&oq=python+class&aqs=chrome.0.69i57j0l3j69i62l2.2145j1&sourceid=chrome&ie=UTF-8
Defining classes
http://openhome.cc/Gossip/Python/Class.html
9. Classes
http://larc.ee.nthu.edu.tw/~jcyeh/python/cdoc/tut/node11.html
5.5. Exploring UserDict: A Wrapper Class
http://www.diveintopython.net/object_oriented_framework/userdict.html
5.2. Importing Modules Using from module import
http://www.diveintopython.net/object_oriented_framework/importing_modules.html
Lesson 8 - Classes
http://www.sthurlow.com/python/lesson08/
Python Object Oriented
http://www.tutorialspoint.com/python/python_classes_objects.htm

Designing a RESTful API with Python and Flask - miguelgrinberg.com
http://blog.miguelgrinberg.com/post/designing-a-restful-api-with-python-and-flask
Flask-RESTful — Flask-RESTful 0.2.1 documentation
http://flask-restful.readthedocs.org/en/latest/index.html

Generating HFiles with MapReduce and loading them into HBase - 石头儿 - cnblogs
http://www.cnblogs.com/shitouer/archive/2013/02/20/hbase-hfile-bulk-load.html
[HBase tools] Viewing and parsing HFiles - 我不是春晖 - ITeye tech site
http://zjushch.iteye.com/blog/1676675
Generating HFiles with MapReduce and loading them into HBase, with source code analysis | 三江小渡
http://blog.pureisle.net/archives/1950.html
A Java class implementing union-find (disjoint sets) for big data, based on HBase | 三江小渡
http://blog.pureisle.net/archives/2033.html








Sunday, July 07, 2013

Daily Bookmarks 20130707

Trie: principles and implementation (in Python) - ChenQi's personal space - OSChina
http://my.oschina.net/u/158589/blog/61037
Reading notes: Programming Pearls, chapter 15, with a Python implementation of suffix arrays, and suffix trees | Silent Kogorou Mouri
http://pengwang.me/2013/04/27/%e8%af%bb%e4%b9%a6%ef%bc%9a%e3%80%8a%e7%bc%96%e7%a8%8b%e7%8f%a0%e7%8e%91%e3%80%8b%e7%ac%ac%e5%8d%81%e4%ba%94%e7%ab%a0-%e5%8f%8a-%e5%90%8e%e7%bc%80%e6%95%b0%e7%bb%84%e7%9a%84python%e5%ae%9e%e7%8e%b0/
A Python implementation of the trie | Silent Kogorou Mouri
http://pengwang.me/2013/04/25/trie%E6%A0%91%E7%9A%84python%E5%AE%9E%E7%8E%B0/
A Python implementation of the trie | hbprotoss's blog
http://hbprotoss.github.io/posts/trieshu-de-pythonshi-xian.html
Understanding and analyzing the algorithm of jieba (结巴), the Python Chinese word segmentation module | seanhuang 技术点滴
http://seanhuang.me/?p=542
- Django Dream Team (official DDTCMS site)
http://ddtcms.com/blog/archive/2013/2/17/70/how-to-begin-to-study-the-chinese-word-segmentation/

Trie in Python | 我爱正则表达式
http://iregex.org/blog/trie-in-python.html
Implementing an efficient "autocomplete" feature with a ternary search tree in Python
http://www.starming.com/index.php?action=plugin&v=wave&tpl=union&ac=viewgrouppost&gid=73&tid=17520


Attlin
http://www.attlin.com/
12. Backbone in practice: a web chat room (backbone+django+sqlite) (1): feature analysis | the5fire's tech blog
http://www.the5fire.com/12-backbone-webchat-1.html
About the architecture of this blog | the5fire's tech blog
http://www.the5fire.com/blog-architecture.html
7. Analysis of the backbone todos example (1) | the5fire's tech blog
http://www.the5fire.com/7-backbone-todos-1.html









Friday, July 05, 2013

Daily Bookmarks 20130705

Understanding the Parallelism of a Storm Topology - Michael G. Noll
http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
Record format
http://irc.ccu.edu.tw/tools/page/show_page.php?page_url=/Site/web/dir_517fb4349001b/article_517fb6d84d3b6.html
Efficiently Reading in and Iterating Through Large Files with Python ~ Optinalysis
http://www.nikhilgopal.com/2010/12/dealing-with-large-files-in-python.html

An introduction to MogileFS (MogileFS series, part 1) 扶凯
http://www.php-oa.com/2010/09/26/perl-mogilefs-1.html
Data IAP Day 1
http://dataiap.github.io/dataiap/day4/
OReilly – Hadoop The Definitive Guide (06-2009) « Xu Fei's Blog
http://autofei.wordpress.com/2010/06/27/oreilly-hadoop-the-definitive-guide-06-2009/
Java Example Code using HBase Data Model Operations « Xu Fei's Blog
http://autofei.wordpress.com/2012/04/02/java-example-code-using-hbase-data-model-operations/


Wu Mamber (String Algorithms 2007)
http://www.slideshare.net/mailund/wu-mamber-string-algorithms-2007
Memory Dump | A suffix-search-based multi-pattern matching algorithm: the Wu-Manber algorithm
https://memorycn.wordpress.com/2011/11/05/matching_algorithm_-_wu-manber_algorithm_based_on_the_the_suffix_search_of_multi-mode/


Pig Macro for TF-IDF Makes Topic Summarization 2 Lines of Pig | Hortonworks
http://hortonworks.com/blog/pig-macro-for-tf-idf-makes-topic-summarization-2-lines-of-pig/
TF-IDF in 2 lines of code with Pig Macros - Hadoop, Data, and Systems - Quora
http://hadoop-data-systems.quora.com/TF-IDF-in-2-lines-of-code-with-Pig-Macros
The Brotherhood of coders: Document similarity using Hadoop
http://coderscreed.blogspot.tw/2012/12/document-similarity-using-hadoop.html
TF-IDF in Hadoop Part 3: Documents in Corpus and TFIDF Computation | Marcello de Sales' Blog
http://marcellodesales.wordpress.com/2010/01/10/tf-idf-in-hadoop-part-3-documents-in-corpus-and-tfidf-computation/

Quickstart — Flask 0.10.1 documentation
http://flask.pocoo.org/docs/quickstart/
flask-tumblelog/tumblelog/admin.py at master · rozza/flask-tumblelog · GitHub
https://github.com/rozza/flask-tumblelog/blob/master/tumblelog/admin.py
Write a Tumblelog Application with Flask and MongoEngine — MongoDB Manual 2.4.5
http://docs.mongodb.org/manual/tutorial/write-a-tumblelog-application-with-flask-mongoengine/



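Enhanced "Hadoop Data Analysis Platform" course, 8th session (5 more weeks of content), a nearly free reverse-charging online training course_Hadoop and Distributed Data Processing_ITPUB forum - a professional technical community under it168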
http://www.itpub.net/thread-1629863-1-1.html

HBaseWD: Avoid RegionServer Hotspotting Despite Sequential Keys | Sematext Blog
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
Myths about e-commerce recommender systems
http://www.infoq.com/cn/presentations/electricity-supplier-recommendation-system-thinking
Bit.ly releases Forget-Table to address non-stationary categorical distributions
http://www.infoq.com/cn/news/2013/02/bitly-forget-table

Presentations
http://www.infoq.com/cn/presentations/60
The growth of Tencent Weibo's architecture
http://www.infoq.com/cn/presentations/tencent-blog-structure-growup
Exploring JD.com's cloud storage services and applications
http://www.infoq.com/cn/presentations/jingdong-cloud-storage-services-applications-explore
Partition-Tolerance - Google Search
https://www.google.com.tw/search?q=Partition-Tolerance&source=lnt&tbs=lr:lang_1zh-CN%7Clang_1zh-TW&lr=lang_zh-CN%7Clang_zh-TW&sa=X&ei=QffMUePUL4avkgWO-4HICQ&ved=0CBYQpwUoAQ&biw=1264&bih=711
keyword tf idf - Google Search
https://www.google.com.tw/search?q=keyword+tf+idf&ei=7lzNUe20FsavkgXv64CwCw&start=10&sa=N&biw=1264&bih=711
Keyword Extraction Based on tf/idf for Chinese News Document
http://d.wanfangdata.com.cn/Periodical_whdxxb-e200705030.aspx
[Repost] A primer on TF-IDF for keyword extraction - 码农.KEN - cnblogs
http://www.cnblogs.com/ken-zhang/archive/2010/06/20/1761108.html
National Chiao Tung University OpenCourseWare (OCW)
http://ocw.nctu.edu.tw/course_detail_3.php?bgid=9&gid=0&nid=413&v1=82a09096121314b8298ca6a3259b732e24e5a073#.UdaPKz5NtcO



Daily Bookmarks 20130703_2


ALGORITHMIC ETUDES: Map-reduce for pairwise document similarity calculation
http://algoetudes.blogspot.tw/2012/12/map-reduce-for-pairwise-document.html
[hadoop] A MapReduce implementation of k-means for clustering Chinese websites at scale (part 1) - lawrencesgj's column - Blog Channel - CSDN.NET
http://blog.csdn.net/lawrencesgj/article/details/8606532
[hadoop] A MapReduce implementation of k-means for clustering Chinese websites at scale (part 2) - lawrencesgj's column - Blog Channel - CSDN.NET
http://blog.csdn.net/lawrencesgj/article/details/8606570






