Saturday, March 31, 2012

Daily Bookmarks 20120331

科学网—中文分词 - 刘伟的博文
http://blog.sciencenet.cn/home.php?mod=space&uid=5691&do=blog&id=11656
科学网—四大名著的词频统计 - 刘伟的博文
http://blog.sciencenet.cn/home.php?mod=space&uid=5691&do=blog&id=11657
构造汉语的统计计算语言模型 (HIT)
https://docs.google.com/viewer?a=v&q=cache:BON9V47rqREJ:ir.hit.edu.cn/~zhangyu/%25D6%25D0%25CE%25C4%25D0%25C5%25CF%25A2%25B4%25A6%25C0%25ED/%25BA%25BA%25D3%25EF%25B7%25D6%25B4%25CA--%25D5%25C5%25D3%25EE.ppt+&hl=zh-TW&pid=bl&srcid=ADGEESivQFLhy8Q7lui5R15EGgdRSuM5xt4D3RFmw5Ht3OhaSOV-OJN6jMajy8lP7rSHxidjTJxNgzTSEjAiaxo2atbbZ7oDvPlYJPb7vjVxRSQaetd-s-i6FioJMsMOSN5t4aKyDch8&sig=AHIEtbTdgCr9Zyh7qwyHSQQZAfn6_cR2gA
Unicode In Python, Completely Demystified
http://farmdev.com/talks/unicode/
Python筆記:產生N-gram以及簡單頻率統計 @ Freedom is not free :: 痞客邦 PIXNET ::
http://hambao.pixnet.net/blog/post/18823664-python%E7%AD%86%E8%A8%98%EF%BC%9A%E7%94%A2%E7%94%9Fn-gram%E4%BB%A5%E5%8F%8A%E7%B0%A1%E5%96%AE%E9%A0%BB%E7%8E%87%E7%B5%B1%E8%A8%88
使用正向最大匹配算法实现中文分词简单模型-用trie树实现 - lyflower的专栏 - 博客频道 - CSDN.NET
http://blog.csdn.net/lyflower/article/details/1452091
中文分词入门之最大匹配法 | 我爱自然语言处理
http://www.52nlp.cn/maximum-matching-method-of-chinese-word-segmentation
程式設計實習(二) Computer Programming Lab.(II) - hongjiedai
https://sites.google.com/site/hongjiedai/course/yzu_nlp_lecture
全文檢索:斷詞 @ 電腦資訊 :: 隨意窩 Xuite日誌
http://blog.xuite.net/sugopili/computerblog/12762738-%E5%85%A8%E6%96%87%E6%AA%A2%E7%B4%A2%EF%BC%9A%E6%96%B7%E8%A9%9E
基于用字共现频率统计的外国译名自动识别-【维普网】-仓储式在线作品出版平台-www.cqvip.com
http://www.cqvip.com/QK/95033X/201201/40567330.html

基于共词分析的国外图书情报学研究热点 (郭春侠, PDF)
159.226.100.150:8085/.../downloadArticleFile.do?...
一種快速網頁檢索結果聚類策略 Fast clustering strategy for web search result
http://d.wanfangdata.com.cn/Periodical_jsjgcyyy201112032.aspx
中文分词入门之字标注法2 | 我爱自然语言处理
http://www.52nlp.cn/%e4%b8%ad%e6%96%87%e5%88%86%e8%af%8d%e5%85%a5%e9%97%a8%e4%b9%8b%e5%ad%97%e6%a0%87%e6%b3%a8%e6%b3%952
共现频率 - Google 搜尋
https://www.google.com/search?q=%E5%85%B1%E7%8E%B0%E9%A2%91%E7%8E%87&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:zh-TW:official&client=firefox-a
基于统计方法的中文姓名识别 - Google 搜尋
https://www.google.com/search?hl=zh-TW&client=firefox-a&hs=Hi2&rls=org.mozilla%3Azh-TW%3Aofficial&q=%E5%9F%BA%E4%BA%8E%E7%BB%9F%E8%AE%A1%E6%96%B9%E6%B3%95%E7%9A%84%E4%B8%AD%E6%96%87%E5%A7%93%E5%90%8D%E8%AF%86%E5%88%AB&oq=%E5%9F%BA%E4%BA%8E%E7%BB%9F%E8%AE%A1%E6%96%B9%E6%B3%95%E7%9A%84%E4%B8%AD%E6%96%87%E5%A7%93%E5%90%8D%E8%AF%86%E5%88%AB&aq=f&aqi=&aql=&gs_l=serp.3...1866l1866l0l2157l1l1l0l0l0l0l117l117l0j1l1l0.llsin.
中文分词_百度百科
http://baike.baidu.com/view/19109.htm
Indexes Of /NLP/中文分词/中文分词/
http://books.ithunder.org/NLP/%E4%B8%AD%E6%96%87%E5%88%86%E8%AF%8D/%e4%b8%ad%e6%96%87%e5%88%86%e8%af%8d/

倒排文件索引结构。该结构及相应的生成算法如下
http://books.ithunder.org/NLP/%E7%B4%A2%E5%BC%95/%e5%80%92%e6%8e%92%e7%b4%a2%e5%bc%95/%e5%80%92%e6%8e%92%e7%b4%a2%e5%bc%95%e5%8e%9f%e7%90%86.txt
聚集索引和非聚集索引-一起jquery|jquery专业学习网站|专注于Web技术
http://www.17jquery.com/mssql/18811/
基于聚类分析的中文新闻网页关键词提取方法研究 - docin.com豆丁网
http://www.docin.com/p-121688687.html
信息检索系统——学习线路 - 009 - ITeye技术网站
http://huangfoxagain.iteye.com/blog/905263
基于维基百科的搜索引擎检索结果聚类(论文) - 生活指南 - 道客巴巴
http://www.doc88.com/p-099209463831.html
基于用字共现频率统计的外国译名自动识别-【维普网】-仓储式在线作品出版平台-www.cqvip.com
http://www.cqvip.com/Read/Read.aspx?id=40567330
基于标签关键词的用户行为分析._百度文库
http://wenku.baidu.com/view/7592c5671ed9ad51f01df2a3.html

-end-

Thursday, March 29, 2012

Daily Bookmarks 20120329

Doing optimizations and doing it in Python - amix.dk
http://amix.dk/blog/post/19407#Doing-optimizations-and-doing-it-in-Python
Todoist progression - amix.dk
http://amix.dk/blog/post/19062#Todoist-progression
Nested lists in Todoist - amix.dk
http://amix.dk/blog/post/19058#Nested-lists-in-Todoist
Sortable list in less than 50 lines of code - amix.dk
http://amix.dk/blog/post/19166
Python list filtering - amix.dk
http://amix.dk/blog/post/19297
What could be improved in Python - amix.dk
http://amix.dk/blog/post/68#What-could-be-improved-in-Python
Small iterators - amix.dk
http://amix.dk/blog/post/19061#Small-iterators
The art of scaling - amix.dk
http://amix.dk/blog/post/19459#The-art-of-scaling
Plurk Comet: Handling of 100.000+ open connections - amix.dk
http://amix.dk/blog/post/19456#Plurk-Comet-Handling-of-100-000-open-connections
Why we can't find a theory of everything - amix.dk
http://amix.dk/blog/post/19457#Why-we-cant-find-a-theory-of-everything
How Twitter (and Facebook) solve problems partially - amix.dk
http://amix.dk/blog/post/19464#How-Twitter-and-Facebook-solve-problems-partially
My thoughts on real time full-text search « LShift Ltd.
http://www.lshift.net/blog/2009/01/20/my-thoughts-on-real-time-full-text-search
Plurk: Instant conversations using Comet - amix.dk
http://amix.dk/blog/post/19490#Plurk-Instant-conversations-using-Comet
Python's set datatype - amix.dk
http://amix.dk/blog/post/19397#Pythons-set-datatype
Don't build simple to succeed - amix.dk
http://amix.dk/blog/post/19335#Dont-build-simple-to-succeed
WebOb — WSGI request and response objects
http://www.webob.org/
Thinking a code design through - amix.dk
http://amix.dk/blog/post/19362#Thinking-a-code-design-through
Beautiful Python documentation using Sphinx - amix.dk
http://amix.dk/blog/post/19366#Beautiful-Python-documentation-using-Sphinx
Consistent hashing implemented simply in Python - amix.dk
http://amix.dk/blog/post/19367#Consistent-hashing-implemented-simply-in-Python
Jottit - amix.dk
http://amix.dk/blog/post/19262#Jottit
My idea - amix.dk
http://amix.dk/blog/post/19231#My-idea
Most Used Text Tags on the Internet (MUTTI) - amix.dk
http://amix.dk/blog/post/34#Most-Used-Text-Tags-on-the-Internet-MUTTI
3 useful Python scritps - amix.dk
http://amix.dk/blog/post/79#3-useful-Python-scritps
Pocoo - Python forum - amix.dk
http://amix.dk/blog/post/173#Pocoo-Python-forum
Smart use of partial - amix.dk
http://amix.dk/blog/post/19094#Smart-use-of-partial
Todoist collapses - amix.dk
http://amix.dk/blog/post/19127#Todoist-collapses
Convert flat data into a hierarchical python list - Stack Overflow
http://stackoverflow.com/questions/1740107/convert-flat-data-into-a-hierarchical-python-list
python - Converting tree list to hierarchy dict - Stack Overflow
http://stackoverflow.com/questions/757244/converting-tree-list-to-hierarchy-dict
Storing Hierarchical Data in a Database Article - SitePoint
http://www.sitepoint.com/hierarchical-data-database/
Why Python for web-development? - amix.dk
http://amix.dk/blog/post/19203#Why-Python-for-web-development
ping不見路: Plurk source code 蛛絲馬跡:Python & MySQL
http://pingyeh.blogspot.com/2008/09/plurk-source-code-python-mysql.html
HMM学习最佳范例四:隐马尔科夫模型 | 我爱自然语言处理
http://www.52nlp.cn/hmm-learn-best-practices-four-hidden-markov-models
程式語言 - Primary Key的產生方式
http://itschool.dgbas.gov.tw/blog/post.do?bid=5&pid=410
读书笔记 <<你的知识需要管理>>
http://www.cntxk.com/CataNews/68/info11040.html
统计模型之间的比较 | LiXiang
http://www.leexiang.com/comparison-between-the-statistical-models
《命名实体识别研究》一文的笔记 - 可微小凡 - 博客园
http://www.cnblogs.com/keweixiaofan/archive/2010/03/18/1689035.html
北京大学计算机研究所多媒体信息处理研究室 : 研究成果
http://www.icst.pku.edu.cn/mipl/tiki-index.php?page=%E7%A0%94%E7%A9%B6%E6%88%90%E6%9E%9C
作業十七:互訊息Mutual Information - yam天空部落
http://blog.yam.com/Wfin/article/15631028
互資訊與條件熵 (Mutual Information and Conditional Entropy) - 陳鍾誠的網站
http://ccckmit.wikidot.com/st:mutualinformation
中山美麗之島 / 精華區 / psychology / 淺論中文斷詞
http://bbs.nsysu.edu.tw/txtVersion/treasure/psychology/M.855653188.A/M.931316362.A/M.931316395.A.html
Mutual information and Normalized Mutual information 互信息和标准化互信息 - 子桥 - 博客园
http://www.cnblogs.com/ziqiao/archive/2011/12/13/2286273.html
Mutual information(MI) and Normalized mutual information(NMI) for numpy - 想法太多,希望太少
http://blog.sun.tc/2010/10/mutual-informationmi-and-normalized-mutual-informationnmi-for-numpy.html

-end-

Wednesday, March 28, 2012

Daily Bookmarks 20120328

python抓取所有航空公司新闻 - - ITeye技术网站
http://junfeng-feng.iteye.com/blog/1010054
pyQuery [Python俱乐部]
http://www.pythonclub.org/modules/pyquery
python版本的jquery | 百变贝贝
http://www.juyimeng.com/python-jquery.html
抓取扬子晚报当天时评版的源码 - 华华的日志 - 网易博客
http://jiangauthor.blog.163.com/blog/static/14179949201081124722674/
一段成功的pyquery中文代码 - 华华的日志 - 网易博客
http://jiangauthor.blog.163.com/blog/static/141799492010811102056817/
Python网页抓取:获取页面中某段内容的xpath - Kerwin的工作日志 - 博客频道 - CSDN.NET
http://blog.csdn.net/kerwin_liu/article/details/6407094
用Python实现调用Google翻译 | I'm TualatriX
http://imtx.me/archives/650.html
昭佑.天翔: HTML CSS 如何表達 Class Name 裡面的 space 空白
http://tomkuo139.blogspot.com/2010/03/html-css-class-name-space.html
jQuery + Firebug组合,超强! - dmcpxy的博客 - ITeye技术网站
http://alfoo.iteye.com/blog/190575
利用FireBug使JQuery的学习更加轻松愉快 - CareySon - 博客园
http://www.cnblogs.com/CareySon/archive/2010/01/01/1637535.html
zhihu-to-renren/zhihu_to_renren.py at master · bojanliu/zhihu-to-renren · GitHub
https://github.com/bojanliu/zhihu-to-renren/blob/master/zhihu_to_renren.py
此博客网站的开发总结 | NewLiu.com
http://newliu.com/post/3/
Django学习笔记—Comments库的使用方法小记 | NewLiu.com
http://newliu.com/post/11/
Python学习笔记—PyQuery库的使用总结 | NewLiu.com
http://newliu.com/post/18/
Python UnicodeEncodeError: 'ascii' codec can't encode character « SaltyCrane Blog
http://www.saltycrane.com/blog/2008/11/python-unicodeencodeerror-ascii-codec-cant-encode-character/
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 99: ordinal not in range(128)
http://www.velocityreviews.com/forums/t351112-unicodeencodeerror-ascii-codec-cant-encode-character-u-xe1-in-position-99-ordinal-not-in-range-128-a.html
Python tips: 什么是*args和**kwargs? - MK2 - 博客园
http://www.cnblogs.com/fengmk2/archive/2008/04/21/1163766.html
Command Line Arguments In Python - Stack Overflow
http://stackoverflow.com/questions/1009860/command-line-arguments-in-python
魚乾的筆記本: 書:學徒模式 Apprenticeship Patterns
http://kevyu.blogspot.com/2011/08/apprenticeship-patterns.html
Git 初學筆記 - 指令操作教學 | Tsung's Blog
http://blog.longwin.com.tw/2009/05/git-learn-initial-command-2009/
css_image_concat: Improve performance by concating your images - amix.dk
http://amix.dk/blog/post/19682#css-image-concat-Improve-performance-by-concating-your-images



-end-

Tuesday, March 27, 2012

Daily Bookmarks 20120327

Reddit 评级算法的工作原理 — PyCoder's Weelky CN
http://pycoders-weekly-chinese.readthedocs.org/en/latest/issue6/how-reddit-ranking-algorithms-work.html
Unicdoe之痛 — PyCoder's Weelky CN
http://pycoders-weekly-chinese.readthedocs.org/en/latest/issue5/unipain.html
Paul Graham:最伟大的创意都是令人恐惧的:(一)做一个新的搜索引擎 | 36氪
http://www.36kr.com/p/89586.html
熱門討論 | Beauty - PTTOnline
http://www.pttonline.cc/Beauty/?sortby=date
在 Python 中使用模糊匹配根据发音搜索 — PyCoder's Weelky CN
http://pycoders-weekly-chinese.readthedocs.org/en/latest/issue4/using-fuzzy-matching-to-search-by-sound-with-python.html

-end-

Sunday, March 25, 2012

Daily Bookmarks 20120325

cat >> ~lyxint/notes » 同步twitter到新浪微博,人人,etc…
http://lyxint.com/archives/106
feilaoda/FlickBoard · GitHub
https://github.com/feilaoda/FlickBoard
我的2012计划 | heroicYang's Blog
http://www.heroicyang.com/%e6%88%91%e7%9a%842012%e8%ae%a1%e5%88%92/
Doit.im Blog
http://enblog.doit.im/
V2EX › 花了三天时间做了个GAE小应用:猜电影
http://v2ex.appspot.com/t/12915#reply30
做完猜电影的一点感想 - 无网不剩
http://blog.leezhong.com/essay/2011/05/30/iguess-feeling.html
lzyy/iguess
https://github.com/lzyy/iguess
have you lost yourself? - 无网不剩
http://blog.leezhong.com/essay/2011/05/23/have-you-lost-your-self.html
php实现实时通信 - 无网不剩
http://blog.leezhong.com/tech/2011/03/21/php-comet.html
web开发从小工到大家 - 无网不剩
http://blog.leezhong.com/tech/2010/12/18/web-development-journeyman-master.html
为什么没有人投资 Livid 的 V2EX.com ? - DBA Notes
http://www.dbanotes.net/review/livid_v2ex.html
The Python Paradox
http://www.paulgraham.com/pypar.html

-end-

Saturday, March 24, 2012

Daily Bookmarks 20120324

Felix Ding - Blog - SPAM、Bayesian和中文 2
http://dingyu.me/blog/spam-bayesian-chinese-2
貝氏定理(上) – Monty Hall 的三扇門 – MMDays
http://mmdays.com/2008/04/29/bayes1/
貝氏定理(下) – 99%的準確度 – MMDays
http://mmdays.com/2008/05/04/bayes2/
貝氏定理 - 维基百科,自由的百科全书
http://zh.wikipedia.org/wiki/Bayes%E5%AE%9A%E7%90%86
Google 搜尋引擎使用的矩陣運算 | 線代啟示錄
http://ccjou.wordpress.com/2009/05/02/google-%e6%90%9c%e5%b0%8b%e5%bc%95%e6%93%8e%e4%bd%bf%e7%94%a8%e7%9a%84%e7%9f%a9%e9%99%a3%e9%81%8b%e7%ae%97/
貝氏定理——量化思考的利器 | 線代啟示錄
http://ccjou.wordpress.com/2009/04/29/%E8%B2%9D%E6%B0%8F%E5%AE%9A%E7%90%86-%E9%87%8F%E5%8C%96%E6%80%9D%E8%80%83%E7%9A%84%E5%88%A9%E5%99%A8/
蛛網: 以貝氏定理分析搭訕男人的畜牲心裡 - yam天空部落
http://blog.yam.com/mmadcity/article/30734147
18世纪的贝叶斯定理成为Google计算的新力量 - Net130.com
http://www.net130.com/CMS/Pub/news/164546.htm

基于用户投票的排名算法(三):Stack Overflow - 阮一峰的网络日志
http://www.ruanyifeng.com/blog/2012/03/ranking_algorithm_stack_overflow.html

贝叶斯推断及其互联网应用(一) - 阮一峰的网络日志
http://www.ruanyifeng.com/blog/2011/08/bayesian_inference_part_one.html
贝叶斯推断及其互联网应用(二) - 阮一峰的网络日志
http://www.ruanyifeng.com/blog/2011/08/bayesian_inference_part_two.html
Ross Poulton
http://www.rossp.org/




-end-

Daily Bookmarks 20120323

A Simple E-Shop Application Using PHP and MySQL - KICCP Blog
http://blog.kiccp.com/34.html
记一个基于SAE 的关于宝宝日记博客主题应用的诞生 - SAEPy blog
http://www.saespot.com/topic/64/%E8%AE%B0%E4%B8%80%E4%B8%AA%E5%9F%BA%E4%BA%8Esae-%E7%9A%84%E5%85%B3%E4%BA%8E%E5%AE%9D%E5%AE%9D%E6%97%A5%E8%AE%B0%E5%8D%9A%E5%AE%A2%E4%B8%BB%E9%A2%98%E5%BA%94%E7%94%A8%E7%9A%84%E8%AF%9E%E7%94%9F#r196



-end-

Friday, March 23, 2012

Daily Bookmarks 20120322

宅之力: 解決方法: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)
http://blog.wahahajk.com/2009/08/unicodedecodeerror-ascii-codec-cant.html
How to Use UTF-8 with Python (evanjones.ca)
http://www.evanjones.ca/python-utf8.html
查詢簡介
http://140.135.41.13/query/search_help.html
'ascii' codec can't decode byte - Google 搜尋
https://www.google.com/search?q=%27ascii%27+codec+can%27t+decode+byte+&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:zh-TW:official&client=firefox-a
关于C-DBLP
http://www.cdblp.cn/about_us.php
Abstract - SpringerLink Duplicate Identification in Deep Web Data Integration
http://www.springerlink.com/content/j6l154645353704h/
Efficient Algorithms for Approximate Member Extraction Using Signature-based Inverted Lists. - Google 搜尋
https://www.google.com/search?q=Efficient+Algorithms+for+Approximate+Member+Extraction+Using+Signature-based+Inverted+Lists.&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:zh-TW:official&client=firefox-a
Informatics@TaiBIF » 貝氏分類法(Bayes classifier)
http://taibif.org.tw/informatics/?p=452
fcamel's blog: Naive Bayes Classifier 的原理(單刀直入版)
http://fcamel-fc.blogspot.com/2009/02/naive-bayes-classifier.html
Naive Bayes classifier in 50 lines
http://ebiquity.umbc.edu/blogger/2010/12/07/naive-bayes-classifier-in-50-lines/
PyCon US 2012 sessions about Bayesian Classifier and Python
http://lanyrd.com/2012/pycon/schedule/?topics=python,bayesian-classifier
machine learning - Naive Bayes classifier using python - Stack Overflow
http://stackoverflow.com/questions/9677603/naive-bayes-classifier-using-python
fcamel's blog: Naive Bayes Classifier 的原理
http://fcamel-fc.blogspot.com/2009/02/naive-bayes-classifier_19.html
Naive Bayes的Python实现 – 四号程序员
http://www.coder4.com/archives/1511
IT牛人博客聚合 - 语义情感趋势分析入门的一份译稿
http://m.udpwork.com/item/2663.html
Artificial Intelligence in Motion: Working on Sentiment Analysis on Twitter with Portuguese Language
http://aimotion.blogspot.com/2010/07/working-on-sentiment-analysis-on.html
Movie Reviews Naive Bayes Subjectivity Classifier for Python NLTK data set | Infochimps
http://www.infochimps.com/datasets/movie-reviews-naive-bayes-subjectivity-classifier-for-python-nlt#overview_tab
Andrej's Machine Learning Lectures
http://karpathy.ca/mlsite/lecture2.php
Wybiral: Naive Bayes (and author detection)
http://davywybiral.blogspot.com/2011/04/naive-bayes-and-author-detection.html
hoamon's sandbox: 中央民意代表的多樣性
http://blog.hoamon.info/2008/01/blog-post.html

Probabilistic Networks and Fuzzy Clustering as Generalizations of ...
borgelt.net/papers/dofd_98.pdf
基于Ruby和中文分词的贝叶斯反垃圾解决方案 - Ruby - ChinaUnix.net -
http://bbs.chinaunix.net/thread-3622627-1-1.html
人工智能算法—朴素贝叶斯分类 - 腾讯soso团队博客 - 博客频道 - CSDN.NET
http://blog.csdn.net/soso_blog/article/details/5836953


-end-

Wednesday, March 21, 2012

Daily Bookmarks 20120321

python - web.py import in template, want to understand how it works - Stack Overflow
http://stackoverflow.com/questions/5047009/web-py-import-in-template-want-to-understand-how-it-works
PyCon US 2012 视频分享 - 谁抢了我的刺猬 - 博客园
http://www.cnblogs.com/qdwang/archive/2012/03/19/2406402.html
Analysis of a Probabilistic Record Linkage Technique without Human Review
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1479910/
python - approximate search in a database - Stack Overflow
http://stackoverflow.com/questions/7741082/approximate-search-in-a-database
如何確定創紀錄的每一個源,同一人
http://www.zhtwco.info/index.php?db=so&id=122990
一种基于语义及统计分析的Deep Web实体识别机制
http://www.jos.org.cn/1000-9825/19/194.htm
References
http://pub.chinasciencejournal.com/article/getReference.action?articleId=2211&journalId=542&isOnlineFirst=0&issn=1000-9825&journalDefName=
初识中文分词技术 - 北漂小石的博客
http://www.niutian365.com/blog/article.asp?id=230
A Survey of Indexing Techniques for Scalable Record Linkage and Deduplication
http://www.computer.org/portal/web/csdl/doi/10.1109/TKDE.2011.127
twitter UI已成习惯? – Ued/others – JsLover
http://jslover.com/?p=305

-end-

Tuesday, March 20, 2012

Daily Bookmarks 20120320

bash regex =~ case insensetive, possible? - The UNIX and Linux Forums
http://www.unix.com/shell-programming-scripting/110312-bash-regex-case-insensetive-possible.html
Loops in Bash
http://www.softpanorama.org/Scripting/Shellorama/Control_structures/loops.shtml
Bash For Loop Examples
http://www.cyberciti.biz/faq/bash-for-loop/
Bash Regular Expressions | Linux Journal
http://www.linuxjournal.com/content/bash-regular-expressions
Conditional Constructs - Bash Reference Manual
http://www.gnu.org/software/bash/manual/html_node/Conditional-Constructs.html
Bash in-process regular expressions
http://aplawrence.com/Linux/bash-regex.html
goagent教程详细版|猫理会
http://maolihui.com/goagent-detailed-version-of-the-tutorial.html
goagent - a gae proxy forked from gappproxy/wallproxy - Google Project Hosting
http://code.google.com/p/goagent/
Gashero Geek Front: The Design Rules of API
http://gashero.blogspot.com/2012/02/design-rules-of-api.html
Get the Favicon Image of any Website with Google S2 Converter
http://www.labnol.org/internet/get-favicon-image-of-websites-with-google/4404/
通过 Google 把 FavIcon 缓存到本地 - quakemachine
http://www.quakemachinex.com/blog/?p=189
javascript - Get website's favicon with JS - Stack Overflow
http://stackoverflow.com/questions/2057636/get-websites-favicon-with-js

-end-

Monday, March 19, 2012

[survey] 1203 phone

萬元左右的Android智慧型手機 @ 好人超的第五個窩 :: 痞客邦 PIXNET ::
http://james803.pixnet.net/blog/post/41990081
John's POV 小約翰觀點~: [Galaxy S mini介紹篇] S5830 為什麼叫Galaxy Ace 王者機?開箱影片、外觀、規格初體驗
http://john547.blogspot.com/2011/04/galaxy-s-mini-s5830-galaxy-ace.html
[心得] 台灣大哥大學生專案 - 看板 MobileComm - 批踢踢實業坊
http://webcache.googleusercontent.com/search?q=cache:t-Dii73UUkUJ:www.ptt.cc/bbs/MobileComm/M.1329114388.A.3AB.html+&cd=18&hl=zh-TW&ct=clnk&client=firefox


-end-

Daily Bookmarks 20120319

Markdown 輕量標記語言介紹與網路資源 - 思創軟體
http://lyhdev.com/note:markdown#dokuwiki_markdown
Dokuwiki + S5 自動產生投影片
http://blog.lyhdev.com/2012/01/dokuwiki-s5.html
杂乱的书桌: 我学习,我思考,我记录
http://www.quhuashuai.com/
10个最有前途的年轻开源项目_IT业界_西部e网
http://webcache.googleusercontent.com/search?q=cache:E_vU5ifwwVIJ:www.weste.net/2010/10-28/73248.html+&cd=10&hl=zh-TW&ct=clnk&client=firefox-a
千慮愚者: Markov chain model (馬可夫鍊模型) 與求解演算法 forward/backward algorithm
http://learn-and-think.blogspot.com/2010/05/markov-chain-model-forwardbackward.html
GTD-Free Home
http://gtd-free.sourceforge.net/index.html
利用反向代理复活dropbox外链 - 陶哥的博客
http://www.wxtg.info/2011/11/62001.html
产品
http://yubosun.akcms.com/product/
网易新闻的乌龙:title写死了
http://yubosun.akcms.com/product/wangyi-xinwen.htm
使用SQLite数据库是中小站点CMS的最佳选择
http://yubosun.akcms.com/tech/sqlite-cms.htm
又做了一个图书网站
http://yubosun.akcms.com/product/seo-wangzhan-books-list.htm
5条简单命令分析百度蜘蛛的爬行记录
http://yubosun.akcms.com/tech/mingling-fenxi-zhizhu.htm
AKCMS 遵循的几条数据库设计原则
http://yubosun.akcms.com/tech/akcms-database-schema.htm
AKCMS 站内搜索功能设计草稿
http://yubosun.akcms.com/tech/akcms-search-design.htm
AKCMS静态分页功能实现方案
http://yubosun.akcms.com/tech/akcms-page-scheme.htm

-end-

Sunday, March 18, 2012

Daily Bookmarks 20120318

Duotone Theme — WordPress.com
http://theme.wordpress.com/themes/duotone/
Will's blog
http://www.bluesock.org/~willg/blog/
willkg (Will Kahn-Greene)
https://github.com/willkg
涂飞平的博客空间 - My blog, my life
http://www.tufeiping.com/index.cgi
PyBlosxom 重启 定制笔记 @ 2006-01-01 23:23 - Zoom.Quiet's PyBlosxom blogging
http://blog.zoomquiet.org/pyblosxom/techic/PyBlosxom/PyblosxomInstallog-2006-01-01-23-23.html
Python 博客系统,Kukkaisvoima 12 发布 - 开源软件 - OPEN开源资讯
http://www.open-open.com/news/view/37b4a0
Kukkaisvoima - lightweight blog engine
http://23.fi/kukkaisvoima/
博客 - 免费简洁的办公室聊天软件Besteam
http://besteam.im/blogs/
压力很大的BLOG
http://ipconfiger.github.com/
《在路上 …》 Locality Sensitive Hash - 张沈鹏,在路上... - ITeye技术网站
http://zsp.iteye.com/blog/769030
MarkDown 编辑器 — LinuxTOY
http://linuxtoy.org/archives/markdown-wysiwyg-editor.html
MaDe Markdown Editor | SHELLEX!
http://shellex.info/ma-de-markdown-wysiwyg-editor/
charles leifer | Writing a real-time chat app using Hookbox and Flask
http://charlesleifer.com/blog/writing-a-real-time-chat-app-using-hookbox-and-flask/
img.ly photo sharing service for twitter
http://img.ly/images/2487618/full
Intermediate Python on Google App Engine: Creating blog engine
http://forum.codecall.net/python-tutorials/34212-python-google-app-engine-creating-blog-engine.html
Hookbox
http://labs.gameclosure.com/hookbox/

Chang Choy 郑才师傅 « 郑才王氏武术醒狮团
http://ccwongs.wordpress.com/bibliography/176-2/
意力拳今天的报纸的报道 - 武术论坛 - 佳礼网络社区综合论坛 ~ 马来西亚中文论坛
http://cforum.cari.com.my/viewthread.php?tid=1832615



-end-

Saturday, March 17, 2012

Daily Bookmarks 20120317

Redis at Disqus | Brett Hoerner's blog
http://bretthoerner.com/2011/2/21/redis-at-disqus/
非我发明症(Not Invented Here (NIH) Syndrome) - 旁观的细节 - 博客大巴
http://anythingbut.blogbus.com/logs/7794193.html
pyvideo.org - Disqus: Serving 400 million people with Python
http://pyvideo.org/video/418/pycon-2011--disqus--serving-400-million-people-wi



-end-

Thursday, March 15, 2012

Daily Bookmarks 20120315

itertools – Iterator functions for efficient looping - Python Module of the Week
http://www.doughellmann.com/PyMOTW/itertools/
Python itertools grouper with truncation
http://log.bthomson.com/2011/01/python-itertools-grouper-with.html
Python: copying a list the right way
http://henry.precheur.org/python/copy_list
life is short - you need Python!: How to copy a list in Python?
http://love-python.blogspot.com/2008/04/how-to-copy-list-in-python.html
python - Pythonic way of copying an iterable object - Stack Overflow
http://stackoverflow.com/questions/3826746/pythonic-way-of-copying-an-iterable-object
How can I use python itertools.groupby() to group a list of strings by their first character? - Stack Overflow
http://stackoverflow.com/questions/2472001/how-can-i-use-python-itertools-groupby-to-group-a-list-of-strings-by-their-fir
可爱的 Python: Python 之优雅与瑕疵,第 1 部分
http://www.ibm.com/developerworks/cn/linux/l-python-elegance-1.html
python - how itertools.tee works, can type 'itertools.tee' be duplicated in order to save it's "status"? - Stack Overflow
http://stackoverflow.com/questions/3957270/how-itertools-tee-works-can-type-itertools-tee-be-duplicated-in-order-to-save
用python解决 Google Treasure Hunt 2008 Question:Primes - prettyinsight的专栏 - 博客频道 - CSDN.NET
http://blog.csdn.net/prettyinsight/article/details/5038040
俄羅斯免費VPS - ihc.ru - 0.618網絡空間
http://0618.us/russia-free-vps-ihc_ru/
dotcloud | 陋室博客
http://bolg.malu.me/html/tag/dotcloud
試用 dotcloud
http://gugod.org/2011/05/-dotcloud/
iGFW » 使用dotcloud免费ssh翻墙
http://igfw.net/archives/7257
在 Dotcloud 上架设 Django 网站 - python 主机 - python.cn(news, jobs)
http://simple-is-better.com/news/378
dotcloud | Yangtse's Blog
http://blog.yangtse.me/tag/dotcloud/
如何在 Dotcloud 上部署 flask 应用 - Flask - python.cn(news, jobs)
http://simple-is-better.com/news/317
WordPress on dotCloud | Blue
http://www.kdblue.com/2011/07/wordpress-dotcloud/
用 Python 建英语单词表 | Blue
http://www.kdblue.com/2010/10/build-word-list-using-python/
用dotCloud做域名转发_国外主机评测网_789主机Bus
http://webcache.googleusercontent.com/search?q=cache:Q5KFptzOgEQJ:789bus.com/domain/20111115/16227.html+&cd=32&hl=zh-TW&ct=clnk&lr=lang_zh-CN%7Clang_zh-TW&client=firefox-a




-end-

Wednesday, March 14, 2012

Tuesday, March 13, 2012

Daily Bookmarks 20120312

HyperDex学习笔记 | 徐明明
http://xumingming.sinaapp.com/785/hyperdex-notes/
硅谷称之为“常识”的网站设计过程原则 - 开源中国社区
http://www.oschina.net/news/26676/how-to-design-for-normals
可搜索的分布式Key-Value存储 HyperDex - OPEN开发经验库
http://www.open-open.com/lib/view/open1330059765093.html
Twitter / @zizon: 简单扫了眼hyperdex的论文. 所谓的hype ...
https://twitter.com/#!/zizon/statuses/173063018853302272
Play社区|HyperDex: A Searchable Distributed Key-Value Store
http://www.playframework.me/topic/4f48f5e9dde7dfd1590045ac
CityHash | 谷奥——探寻谷歌的奥秘
http://www.guao.hk/tag/cityhash

-end-

Friday, March 09, 2012

Daily Bookmarks 20120309

在GAE的数据库中实现多对多的关系
http://www.keakon.net/2009/03/01/%E5%9C%A8GAE%E7%9A%84%E6%95%B0%E6%8D%AE%E5%BA%93%E4%B8%AD%E5%AE%9E%E7%8E%B0%E5%A4%9A%E5%AF%B9%E5%A4%9A%E7%9A%84%E5%85%B3%E7%B3%BB
Google App Engine的入门视频
http://www.keakon.net/2009/03/12/GoogleAppEngine%E7%9A%84%E5%85%A5%E9%97%A8%E8%A7%86%E9%A2%91
原来Google App Engine也使用了秒传技术
http://www.keakon.net/2009/03/17/%E5%8E%9F%E6%9D%A5GoogleAppEngine%E4%B9%9F%E4%BD%BF%E7%94%A8%E4%BA%86%E7%A7%92%E4%BC%A0%E6%8A%80%E6%9C%AF
使用Sharding Counters技术提升GAE的计数性能
http://www.keakon.net/2009/03/17/%E4%BD%BF%E7%94%A8ShardingCounters%E6%8A%80%E6%9C%AF%E6%8F%90%E5%8D%87GAE%E7%9A%84%E8%AE%A1%E6%95%B0%E6%80%A7%E8%83%BD
不要以关系数据库的观点来使用GAE的BigTable
http://www.keakon.net/2009/03/31/%E4%B8%8D%E8%A6%81%E4%BB%A5%E5%85%B3%E7%B3%BB%E6%95%B0%E6%8D%AE%E5%BA%93%E7%9A%84%E8%A7%82%E7%82%B9%E6%9D%A5%E4%BD%BF%E7%94%A8GAE%E7%9A%84BigTable
使用Decorator让数据自动存储在GAE的memcache里
http://www.keakon.net/2009/04/12/%E4%BD%BF%E7%94%A8Decorator%E8%AE%A9%E6%95%B0%E6%8D%AE%E8%87%AA%E5%8A%A8%E5%AD%98%E5%82%A8%E5%9C%A8GAE%E7%9A%84memcache%E9%87%8C
让GAE支持中文URL处理
http://www.keakon.net/2009/04/14/%E8%AE%A9GAE%E6%94%AF%E6%8C%81%E4%B8%AD%E6%96%87URL%E5%A4%84%E7%90%86
Paging through large datasets - Google App Engine - Google Code
http://code.google.com/intl/en/appengine/articles/paging.html
Powered By ~ Sarath Chandra Pandurangi's Blog
http://blog.sarathonline.com/2009/02/powered-by.html
Decorator for Memcache Get/Set in python ~ Sarath Chandra Pandurangi's Blog
http://blog.sarathonline.com/2009/02/decorator-for-memcache-getset-in-python.html
用GAE+jQuery打造无需数据库的AJAX聊天室
http://www.keakon.net/2009/04/22/%E7%94%A8GAE+jQuery%E6%89%93%E9%80%A0%E6%97%A0%E9%9C%80%E6%95%B0%E6%8D%AE%E5%BA%93%E7%9A%84AJAX%E8%81%8A%E5%A4%A9%E5%AE%A4
一个用于GAE数据库分页的模块
http://www.keakon.net/2009/05/28/%E4%B8%80%E4%B8%AA%E7%94%A8%E4%BA%8EGAE%E6%95%B0%E6%8D%AE%E5%BA%93%E5%88%86%E9%A1%B5%E7%9A%84%E6%A8%A1%E5%9D%97
Move out from AppEngine, and Python PaaS alternatives
http://www.slideshare.net/tzangms/move-out-from-appengine
InfoQ: Google App Engine正式支持Python 2.7
http://www.infoq.com/cn/news/2012/03/gae-python27

think
How can knowledge be accumulated efficiently?

-end-

Wednesday, March 07, 2012

Daily Bookmarks 20120307

python itertools and groupby | Pietro Abate homepage
http://mancoosi.org/~abate/python-itertools-and-groupby
5. Expressions — Python v2.7.2 documentation
http://docs.python.org/reference/expressions.html
Python: Lambda Functions
http://www.secnetix.de/olli/Python/lambda_functions.hawk
9.7. itertools — Functions creating iterators for efficient looping — Python v2.7.2 documentation
http://docs.python.org/library/itertools.html#itertools.groupby
iteration - How do I use Python's itertools.groupby()? - Stack Overflow
http://stackoverflow.com/questions/773/how-do-i-use-pythons-itertools-groupby
馬克雜想...: python怎麼sort dictionary?
http://macwang.blogspot.com/2010/09/pythonsort-dictionary.html
longest common substring « demonstrate 的 blog
http://remonstrate.wordpress.com/tag/longest-common-substring/
ruby - All Common Subsequences in Array of Strings - Stack Overflow
http://stackoverflow.com/questions/8148417/all-common-subsequences-in-array-of-strings
Find elements of a list that contain substrings from another list in Python - Stack Overflow
http://stackoverflow.com/questions/6828636/find-elements-of-a-list-that-contain-substrings-from-another-list-in-python
groupbyhead: Group a list of items according to the starting character(s) of items. « Python recipes « ActiveState Code
http://code.activestate.com/recipes/465830-groupbyhead-group-a-list-of-items-according-to-the/

-end-

Tuesday, March 06, 2012

Daily Bookmarks 20120306

Elepath, Inc.
http://elepath.com/an-introduction-to-elepath
申请了webfaction空间后,做些什么? | 宇宙尽头的餐馆 - 大道直如发,春日佳气多。五陵贵公子,双双鸣玉珂。
http://blog.spikeyang.com/?p=25
Lucene中文分词 | Jeff的妙想奇境
http://www.jeffkit.info/2007/10/622/
WebFaction 與 Django 而且 D 不發音 | 科學的愛情
http://vinta.ws/lambda/13
Webfaction - Oceanic / 人生海海
http://tzangms.com/blog/2412/
photoimage you always 搬新家 | photo you always
http://blog.1798.in/blog/photoimage-you-always-%e6%90%ac%e6%96%b0%e5%ae%b6/
How To Build A Web App in Four Days For $10,000 (Say Hello To Matt) | TechCrunch
http://techcrunch.com/2008/07/03/how-to-build-a-web-app-in-four-days-for-10000-say-hello-to-matt/
四天内制作一个网站的四点纪要 | My Open Course Ware
http://www.mocw.cn/?p=284
美國人就是要猴急4天趕完一個網站?「太急著想知道」原來是兩面刃--DoNews.com--IT社区&写作平台
http://home.donews.com/donews/article/1/126713.html
Python:縮排,不然就去死 | 科學的愛情
http://vinta.ws/lambda/12
Facebook Graph API 認證機制 | 科學的愛情
http://vinta.ws/lambda/16



-end-

Monday, March 05, 2012

Daily Bookmarks 20120305

python - Longest Prefix Matches for URLs - Stack Overflow
http://stackoverflow.com/questions/5434813/longest-prefix-matches-for-urls
Python: Determine prefix from a set of (similar) strings - Stack Overflow
http://stackoverflow.com/questions/6718196/python-determine-prefix-from-a-set-of-similar-strings
Grouping strings by prefixes - Python | DaniWeb (to solve the grouping problem)
http://www.daniweb.com/software-development/python/code/217159
Improving a fuzzy matching algorithm - Python - Stack Overflow
http://stackoverflow.com/questions/5043847/improving-a-fuzzy-matching-algorithm-python
pylevenshtein - A fast implementation of Levenshtein Distance (and others) for Python - Google Project Hosting
http://code.google.com/p/pylevenshtein/
Find the common beginning in a list of strings « Python recipes « ActiveState Code
http://code.activestate.com/recipes/252177-find-the-common-beginning-in-a-list-of-strings/

Saturday, March 03, 2012

Friday, March 02, 2012

Daily Bookmarks 20120302

搜索引擎索引之如何建立索引 - malefactor's 布拉格 - 博客频道 - CSDN.NET
http://blog.csdn.net/malefactor/article/details/7299933
后缀树【Suffix Tree】 - TsengYuen的专栏 - 博客频道 - CSDN.NET
http://blog.csdn.net/TsengYuen/article/details/4815921
是工作也是休閒: LCS 最長共同子序列(未完待續)
http://jerrywell.blogspot.com/2010/11/lcs.html

-end-

Thursday, March 01, 2012

Daily Bookmarks 20120301



tiny search engine hash table
http://www.cs.dartmouth.edu/~campbell/cs23/datastructures.html
Filesonic, trying to figure out their URL design.
http://www.woodmann.com/forum/showthread.php?14416-Filesonic-trying-to-figure-out-their-URL-design
Upload Images to Imgur with Ruby | Lance Pollard
http://code.lancepollard.com/upload-images-to-imgur-with-ruby
How to find the original Flickr Photo URL and User from a Static Flickr Image URL/Permalink (My priceless Flickr Tip) | Bram.us
http://www.bram.us/2008/01/12/my-priceless-flickr-tip-how-to-find-the-original-flickr-photo-url-and-user-from-a-static-flickr-image-url/
1.6 image storage - MediaWiki
http://www.mediawiki.org/wiki/1.6_image_storage
The Imgur API - Anonymous Resources
http://api.imgur.com/resources_anon

December « 2011 « Endlessly Curious (python)
http://www.endlesslycurious.com/2011/12/
URL Shortener und Redirects
http://hjacob.com/blog/2009/07/url-shortener-redirects/
» The Small-Scale Approach to Achieving Great Things :zenhabits
http://zenhabits.net/small-scale/
dictmatch及多模算法串讲(一) - 百度互联网技术官方博客 - 博客频道 - CSDN.NET
http://blog.csdn.net/baiduforum/article/details/5436902
小池與安安靜靜...的部落格: Map/Reduce Sorting
http://kalin.myfs.cc/2011/03/mapreduce-sorting.html
String algorithm suggestion to find all the common prefixes of a list of strings - Stack Overflow
http://stackoverflow.com/questions/6634480/string-algorithm-suggestion-to-find-all-the-common-prefixes-of-a-list-of-strings
The Most Efficient Algorithm to Find First Prefix-Match From a Sorted String Array? - Stack Overflow
http://stackoverflow.com/questions/457160/the-most-efficient-algorithm-to-find-first-prefix-match-from-a-sorted-string-arr
performance - finding long repeated substrings in a massive string - Stack Overflow
http://stackoverflow.com/questions/398811/finding-long-repeated-substrings-in-a-massive-string
Longest common subsequence (good)
http://wordaligned.org/articles/longest-common-subsequence

后缀数组_百度百科
http://baike.baidu.com/view/1240197.htm
What is the easiest way to find the longest common prefix or suffix of two sequences in Python? - Quora
http://www.quora.com/What-is-the-easiest-way-to-find-the-longest-common-prefix-or-suffix-of-two-sequences-in-Python
Find the common beginning in a list of strings « Python recipes « ActiveState Code
http://code.activestate.com/recipes/252177-find-the-common-beginning-in-a-list-of-strings/
Idle Time » Blog Archive » Finding the longest common prefix of an array of strings in Python, part 2
http://boredzo.org/blog/archives/2007-01-06/longest-common-prefix-in-python-2
Python: Determine prefix from a set of (similar) strings - Stack Overflow
http://stackoverflow.com/questions/6718196/python-determine-prefix-from-a-set-of-similar-strings
后缀树【Suffix Tree】 - TsengYuen的专栏 - 博客频道 - CSDN.NET
http://blog.csdn.net/TsengYuen/article/details/4815921
Trie in Python | 我爱正则表达式
http://iregex.org/blog/trie-in-python.html
C語言考古題 & C的解題 -- 程式設計學習入門: Problem 10405 Longest Common Subsequence,最長共同子字串
http://using-c.blogspot.com/2010/07/problem-10405-longest-common.html




-end-