Thursday, May 31, 2012

Daily Bookmarks 20120531

jQuery: How to count the number of elements which "display" isn't "none"? - Stack Overflow
http://stackoverflow.com/questions/5325030/jquery-how-to-count-the-number-of-elements-which-display-isnt-none
jQuery count div that has a display:none attribute - Stack Overflow
http://stackoverflow.com/questions/2327312/jquery-count-div-that-has-a-displaynone-attribute
css - Jquery count number of hidden elements within div - Stack Overflow
http://stackoverflow.com/questions/1295956/jquery-count-number-of-hidden-elements-within-div
javascript - jQuery toggle show/hide elements after certain number of matching elements - Stack Overflow nice
http://stackoverflow.com/questions/2411588/jquery-toggle-show-hide-elements-after-certain-number-of-matching-elements
Understanding "event handlers" in JavaScript- onLoad Event handlers
http://www.javascriptkit.com/javatutors/event3.shtml

html - Jquery: Adding a class to even an odd List-Elements - Stack Overflow
http://stackoverflow.com/questions/1171179/jquery-adding-a-class-to-even-an-odd-list-elements
Checkbox filters with jQuery | Ask the CSS Guy
http://askthecssguy.com/articles/checkbox-filters-with-jquery/
35 Useful jQuery Filter and Sort Plugins
http://www.tripwiremagazine.com/2012/05/jquery-filter-sort-plugins.html
javascript - jQuery multiple checkbox filters - Stack Overflow
http://stackoverflow.com/questions/1125767/jquery-multiple-checkbox-filters

Practical Data Analysis in Python
http://www.slideshare.net/hmason/practical-data-analysis-in-python
Text Classification for Sentiment Analysis – Eliminate Low Information Features | StreamHacker
http://streamhacker.com/2010/06/16/text-classification-sentiment-analysis-eliminate-low-information-features/
Text Classification for Sentiment Analysis – Naive Bayes Classifier | StreamHacker
http://streamhacker.com/2010/05/10/text-classification-sentiment-analysis-naive-bayes-classifier/

-end-

Wednesday, May 30, 2012

Daily Bookmarks 20120530

Common techniques for scalable architectures: data sharding - MongoDB, data sharding, horizontal partitioning, vertical partitioning - Java - ITeye Forum
http://www.iteye.com/topic/1119986
Instagram's ID generation strategy [translation] - ITeye tech site
http://yjl49.iteye.com/blog/1522773
High Scalability - How Twitter Stores 250 Million Tweets a Day Using MySQL
http://webcache.googleusercontent.com/search?q=cache:RTO5Pon-OvcJ:highscalability.com/blog/2011/12/19/how-twitter-stores-250-million-tweets-a-day-using-mysql.html+&cd=9&hl=zh-TW&ct=clnk&client=firefox
Instagram Engineering • Sharding & IDs at Instagram
http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram
Django | scot hacker's foobar blog
http://birdhouse.org/blog/tag/django/
sql - What would be the right steps for horizontal partitioning in Postgresql? - Stack Overflow
http://stackoverflow.com/questions/10256923/what-would-be-the-right-steps-for-horizontal-partitioning-in-postgresql


Extragr.am: a great Instagr.am web client | 天涯海阁 | Web2.0Share
http://www.web20share.com/2011/04/extragram-instagram-web.html
Instagr.am says its Android version is in some ways better than the iPhone one - winandmac.com Hong Kong edition
http://www.winandmac.com/2012/03/instagram-said-their-upcoming-android-app-could-be-better-than-the-iphone-version/

Short domains/URL Shortening/Base36/Base62 - 打天打鸭 - ITeye tech site
http://gembler.iteye.com/blog/664157

Instagr.am Picture Previews with Ruby « Jason Neylon's Blog
http://jasonneylon.wordpress.com/2011/02/13/instagr-am-picture-previews-with-ruby/
Using the Instagram API: shortcode to media id
http://darasion.appspot.com/2011/02/18/Instagram-API
How to get media-id from shortcode/url. - Instagram API Developers | Google Groups
http://groups.google.com/group/instagram-api-developers/browse_thread/thread/ed05d994e1d35776
Instagram Engineering • Storing hundreds of millions of simple key-value pairs in Redis
http://instagram-engineering.tumblr.com/post/12202313862/storing-hundreds-of-millions-of-simple-key-value-pairs
Use a zipfile store a dict like k-v database — Gist
https://gist.github.com/2311203
http://api.instagram.com/oembed?url=http://instagr.am/p/BUG/



jQuery price slider filter - Stack Overflow
http://stackoverflow.com/questions/10787027/jquery-price-slider-filter
jquery ui slider double slider filtering - Stack Overflow
http://stackoverflow.com/questions/10299703/jquery-ui-slider-double-slider-filtering
jQuery+CSS: selecting a price range with a slider - Helloweba - dedicated to the application of web front-end technology in China
http://www.helloweba.com/view-search-146.html
Kayak-like filter sliders using jQuery and AJAX pagination in CakePHP | nuts and bolts of cakephp
http://nuts-and-bolts-of-cakephp.com/2009/01/14/kayak-like-filter-sliders-using-jquery-and-ajax-pagination-in-cakephp/

-end-

Tuesday, May 29, 2012

Daily Bookmarks 20120529

How to Protect your Eyes from Continuous Computer Operation
http://www.share2others.com/health/how-to-protect-your-eyes-at-computer-work/861/
Coding in Dreams [Study notes] Probabilistic Latent Semantic Analysis (pLSA)
http://blog.tomtung.com/2011/10/plsa
Coding in Dreams [Study notes] The Expectation-Maximization (EM) algorithm
http://blog.tomtung.com/2011/10/em-algorithm/
Search engine competition heats up: startup Helioid launches a categorized search engine focused on the academic market | 36Kr
http://www.36kr.com/p/64908.html
[BetterExplained] Why you should start blogging (right now)
http://mindhacks.cn/2009/02/15/why-you-should-start-blogging-now/
Integrating the MailChimp service into a Django site - Herock Post
http://herockpost.com/2012/05/django-mailchimp.html
同行
http://itongxing.com/
Giving away a book: 小强升职记 - Herock Post
http://herockpost.com/2009/04/gtd_book.html
SWODE: a personalized portal site - Herock Post
http://herockpost.com/2007/07/swode.html
GTD should be simple and efficient - Herock Post
http://herockpost.com/2007/09/gtd.html
ThinkingRock: the best GTD software - 褪墨
http://www.mifengtd.cn/articles/thinkingrock_overview.html

-end-

Monday, May 28, 2012

Daily Bookmarks 20120528

Coding Horror: URL Shortening: Hashes In Practice
http://www.codinghorror.com/blog/2007/08/url-shortening-hashes-in-practice.html
High Scalability - YouTube Architecture
http://highscalability.com/youtube-architecture
Generate Random Strings Using PHP | Nine-One-One... Need Code, Help!
http://911-need-code-help.blogspot.com/2009/06/generate-random-strings-using-php.html
A first look at theoretical computer science: from hash functions to Wang Xiaoyun's MD5 break « 阅微堂
http://zhiqiang.org/blog/science/computer-science/preliminary-computer-theory-xiao-yun-wang-from-the-hash-function-to-crack-md5.html
How Intel is Solving the Problems with Random Number Generation - Tested
http://www.tested.com/news/2829-how-intel-is-solving-the-problems-with-random-number-generation/
石頭閒語: Using an MD5 library in C programs, and its applications - 樂多日誌
http://blog.roodo.com/rocksaying/archives/3873017.html
hash - MD5 and SHA-2 collisions in Python - Stack Overflow
http://stackoverflow.com/questions/5787471/md5-and-sha-2-collisions-in-python
git 101 – git's object model | Ricky's murmur...
http://jcliang.twgogo.org/267/git-101-git%E7%9A%84%E7%89%A9%E4%BB%B6%E6%A8%A1%E5%9E%8B
How do I create unique IDs, like YouTube? | LinkedIn
http://www.linkedin.com/groups/How-do-I-create-unique-40870.S.70870735
Flickr: Discussing manufacturing flic.kr style photo URLs in Flickr API
http://www.flickr.com/groups/api/discuss/72157616713786392/
YouTube URL algorithm? - Stack Overflow
http://stackoverflow.com/questions/3034861/youtube-url-algorithm
Are YouTube codes guaranteed to always be 11 characters? - Web Applications
http://webapps.stackexchange.com/questions/13854/are-youtube-codes-guaranteed-to-always-be-11-characters


image md5 filename
Database versus files for Images | dave dash
http://davedash.com/2009/02/18/database-versus-files-for-images/
Resizing Image on upload in Django | dave dash
http://davedash.com/2009/02/21/resizing-image-on-upload-in-django/
filenames - Names of image files on a website: hashes or random strings? - Stack Overflow
http://stackoverflow.com/questions/9478680/names-of-image-files-on-a-website-hashes-or-random-strings
Duplicate file/image checking with PHP and MD5 - webdevRefinery Forum
http://webdevrefinery.com/forums/topic/6560-duplicate-fileimage-checking-with-php-and-md5/
storage - Storing a million images in the filesystem - Server Fault
http://serverfault.com/questions/95444/storing-a-million-images-in-the-filesystem
file structure - Convert filenames to their checksum before saving to prevent duplicates. Is is a smart thing to do? - Programmers
http://programmers.stackexchange.com/questions/86921/convert-filenames-to-their-checksum-before-saving-to-prevent-duplicates-is-is-a
hashing - A PHP hash function with a long output length? - Stack Overflow
http://stackoverflow.com/questions/295515/a-php-hash-function-with-a-long-output-length
Coding Horror: Hashtables, Pigeonholes, and Birthdays
http://www.codinghorror.com/blog/2007/12/hashtables-pigeonholes-and-birthdays.html

store image file name md5 - Google Search
https://www.google.com/search?q=store+image+file+name+md5&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:zh-TW:official&client=firefox

-end-

Saturday, May 26, 2012

Daily Bookmarks 20120526

Frighteningly Ambitious Startup Ideas
http://paulgraham.com/ambitious.html
Helioid » Paul Graham’s “Frighteningly Ambitious” Ideas
http://blog.helioid.com/2012/03/paul-grahams-frighteningly-ambitious-ideas/
Helioid’s Search Engine Provides Category Sorting To Aid Research, Targets Students And Professionals | TechCrunch
http://techcrunch.com/2011/12/02/helioid-categories/
Pairwise Document Similarity in Large Collections with MapReduce - 0.028669 seconds
http://blog.ring.idv.tw/comment.ser?i=310
[Java] Pairwise Vector Similarity by Cosine Similarity @ Hadoop 0.20.1 @ 第二十四個夏天後 :: PIXNET :: compare with the previous entry
http://changyy.pixnet.net/blog/post/25551807-%5Bjava%5D-pairwise-vector-similarity-by-cosine-similarity-@-had
Pairwise Document Similarity - Google Search
https://www.google.com/search?q=Pairwise+Document+Similarity&hl=zh-TW&client=firefox-a&hs=hns&rls=org.mozilla:zh-TW:official&prmd=imvns&source=lnt&tbs=lr:lang_1zh-CN%7Clang_1zh-TW&lr=lang_zh-CN%7Clang_zh-TW&sa=X&ei=EJPAT4SDCJHnmAX3p5iiCg&ved=0CFoQpwUoAQ&biw=1130&bih=622
Result Diversity - Google Search
https://www.google.com/search?q=Result+Diversity&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:zh-TW:official&client=firefox-a
A simple Distributed Hash Table (DHT) « / Reza
http://rezahok.wordpress.com/2009/09/21/a-simple-distributed-hash-table-dht/
Prefix Hash Tree_我思故我在_Baidu Space
http://hi.baidu.com/rodimus/blog/item/6929774fc2d9e738afc3abe2.html


Helioid
Helioid » Literature Review 2008 – 2009
http://blog.helioid.com/2009/11/literature-review-2008-2009/
Helioid » Search Refinement (and Helioid) is the Top Tech Trend Video look
http://blog.helioid.com/2011/09/search-refinement-and-helioid-is-the-top-tech-trend/

Linux-HA open-source software Heartbeat (installation) - 技术成就梦想 - 51CTO tech blog
http://ixdba.blog.51cto.com/2895551/547778
How to make a team more efficient - 小菜_默 - 51CTO tech blog
http://riverdream.blog.51cto.com/1559152/876353
Work log 2007, part 1 - srsunbing - 51CTO tech blog
http://srsunbing.blog.51cto.com/3221858/876255
LINE Storage: Storing billions of rows in Sharded-Redis and HBase per Month « NAVER Engineers' Blog
http://tech.naver.jp/blog/?p=1420

term frequency/inverse document frequency (TFIDF) « Infomotions Mini-Musings
http://infomotions.com/blog/tag/term-frequencyinverse-document-frequency-tfidf/



Paper
Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce - Google Search
https://www.google.com/search?q=Brute+force+and+indexed+approaches+to+pairwise+document+similarity+comparisons+with+MapReduce&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:zh-TW:official&client=firefox-a

Hashed Patricia Trie: Efficient Longest Prefix Matching in Peer-to ...

www.cs.uni-paderborn.de/.../HashedPatriciaTrie.pdf

Automatic classification
A Study of Automatic Classification of Chinese Documents
http://cdp.sinica.edu.tw/paper/1993/19930601_1.htm



-end-

Thursday, May 24, 2012

Daily Bookmarks 20120524

Ask HN: how can I generate youtube style id? | Hacker News
http://news.ycombinator.com/item?id=485423
How do I find the unique video ID in new sharing URL? - Google Groups
http://productforums.google.com/forum/#!topic/youtube/r3zYlqEmTcc
Create short IDs with PHP - Like Youtube or TinyURL
http://kevin.vanzonneveld.net/techblog/article/create_short_ids_with_php_like_youtube_or_tinyurl/
php5 - PHP: Short id like Youtube, with salt - Stack Overflow
http://stackoverflow.com/questions/4153628/php-short-id-like-youtube-with-salt
Base58 Encode and Decode using PHP with example; base58_encode(), base58_decode() « Dark Launch
http://darklaunch.com/2009/08/07/base58-encode-and-decode-using-php-with-example-base58-encode-base58-decode
Flickr: Discussing manufacturing flic.kr style photo URLs in Flickr API
http://www.flickr.com/groups/api/discuss/72157616713786392/
YouTube: Does your video ID system really work? | ZDNet
http://www.zdnet.com/blog/btl/youtube-does-your-video-id-system-really-work/10689
Useful YouTube URL Tricks
http://www.techairlines.com/2010/08/21/useful-youtube-url-tricks/
what's the youtube video id maximum length? - Stack Overflow
http://stackoverflow.com/questions/6180138/whats-the-youtube-video-id-maximum-length
c# - YouTube-like GUID - Stack Overflow
http://stackoverflow.com/questions/1458468/youtube-like-guid
YouTube Co-Founder Explains New Video ID System | Epicenter | Wired.com
http://www.wired.com/epicenter/2007/06/youtube-co-founder-explains-new-video-id-system/
YouTube Dataset
http://netsg.cs.sfu.ca/youtubedata/
Tim Wu's notes: dropbox storage techniques leaked by dropship
http://changtimwu.blogspot.com/2011/04/dropbox-storage-techniques-leaked-by.html
From the README, in case it wasn't obvious:"These utilities make use of the dedu... | Hacker News
http://news.ycombinator.com/item?id=2478610

石頭閒語: Using an MD5 library in C programs, and its applications - 樂多日誌
http://blog.roodo.com/rocksaying/archives/3873017.html

Yet Another Chris - C# and .NET - Latest posts - Friendly Unique Id Generation Part 1
http://www.yetanotherchris.me/home/2009/3/3/friendly-unique-id-generation-part-1.html#base64

YouTube/ TinyURL style unique id strings?
http://www.sitepoint.com/forums/showthread.php?425226-YouTube-TinyURL-style-unique-id-strings
MD5 hashing 4-byte and 8-byte keys into 16-byte values; what's the chance of a collision? - Stack Overflow
http://stackoverflow.com/questions/4899403/md5-hashing-4-byte-and-8-byte-keys-into-16-byte-values-whats-the-chance-of-a-c

Counting YouTube Videos via Random Prefix Sampling

www-users.cs.umn.edu/~yanhua/Includes/imc091s.pdf

-end-

Monday, May 21, 2012

Daily Bookmarks 20120520

Code: Flickr Developer Blog » Ticket Servers: Distributed Unique Primary Keys on the Cheap
http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/
Flickr Authorization
http://mashupguide.net/1.0/html/ch06s07.xhtml
What is the design and architecture of Instagr.am's short URLs? - Quora
http://www.quora.com/What-is-the-design-and-architecture-of-Instagr-ams-short-URLs

sullof/shardjs design short url Good~~~
https://github.com/sullof/shardjs
Django snippets: A custom URL shortening view, for use with rev=canonical
http://djangosnippets.org/snippets/1430/
URL Shortener Web Application Using Django
http://nileshk.com/2009/06/02/url-shortener-web-app-using-django.html
Create your own simple short URL generator website « Kaos Coder
http://blog.kaosaelee.com/2011/05/05/create-your-own-simple-short-link-generator-website/
Base62 encoding as used in Weibo short links | CodeTick | following all kinds of programming tricks
http://www.code-trick.com/base62-php/
rev=canonical bookmarklet and designing shorter URLs
http://simonwillison.net/2009/apr/11/revcanonical/
Reverse-engineering Sina's short URL generation « 一路走来……
http://blog.xhbin.com/archives/1068
vikasing: A Simple URL Shortening Algorithm in JAVA
http://www.vikasing.com/2010/11/simple-url-shortening-algorithm-in-java.html
How to make unique short URL with Python? - Stack Overflow
http://stackoverflow.com/questions/1497504/how-to-make-unique-short-url-with-python
language agnostic - Tinyurl-style unique code: potential algorithm to prevent collisions - Stack Overflow
http://stackoverflow.com/questions/1257825/tinyurl-style-unique-code-potential-algorithm-to-prevent-collisions
algorithm - Map incrementing integer range to six-digit base 26 max, but unpredictably - Stack Overflow
http://stackoverflow.com/questions/1051949/map-incrementing-integer-range-to-six-digit-base-26-max-but-unpredictably/1052896#1052896

-end-

Thursday, May 17, 2012

Daily Bookmarks 20120516

Fuzzy matching/chunking algorithm - Stack Overflow
http://stackoverflow.com/questions/5122527/fuzzy-matching-chunking-algorithm
9.2.7 Numeric Comparison with Percentage Tolerance  'FieldComparatorNumericPerc'
http://cs.anu.edu.au/~Peter.Christen/Febrl/febrl-0.3/febrldoc-0.3/node41.html
n-Gram/2L-approximation: a two-level n-gram inverted index ...
http://infolab.dgist.ac.kr/~mskim/papers/CSSE07.pdf
CiteSeerX — A Practical q-Gram Index for Text Retrieval Allowing Errors
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.2942
Go together. - Wagn nice app RoR
http://wagn.org/
k-gram indexes for wildcard queries
http://nlp.stanford.edu/IR-book/html/htmledition/k-gram-indexes-for-wildcard-queries-1.html
Group-average agglomerative clustering
http://nlp.stanford.edu/IR-book/html/htmledition/group-average-agglomerative-clustering-1.html
Permuterm Subject Index
http://dxyw.hep.com.cn:8080/downloads/%E4%BF%A1%E6%81%AF%E6%A3%80%E7%B4%A2%EF%BC%88%E5%A4%9A%E5%AA%92%E4%BD%93%EF%BC%89%E6%95%99%E7%A8%8B%EF%BC%88%E7%AC%AC%E4%BA%8C%E7%89%88%EF%BC%89/chap7/pages/7_2_1_2_5.html
Permuterm index for cs707_011712
http://www.scribd.com/doc/78793611/13/Permuterm-index
Building A Python-Based Search Engine — PyCon2012 Schedule & Notes 1.0 documentation
http://andrew-schoen-pycon-2012-notes.readthedocs.org/en/latest/sunday/session_2.html

bi gram index python - Google Search
https://www.google.com/search?q=bi+gram+index+python&hl=zh-TW&client=firefox-a&hs=Ip3&rls=org.mozilla:zh-TW:official&prmd=imvns&ei=L-OzT_WBJa3mmAW-h-ScBQ&start=20&sa=N&biw=1132&bih=597
Indexing Text in Python
http://vermeulen.ca/python-indexing.html
Faceting — Haystack 2.0.0-beta documentation
http://django-haystack.readthedocs.org/en/latest/faceting.html
Sites Using Haystack — Haystack 2.0.0-beta documentation
http://django-haystack.readthedocs.org/en/latest/who_uses.html
term frequency/inverse document frequency (TFIDF) « Infomotions Mini-Musings
http://infomotions.com/blog/tag/term-frequencyinverse-document-frequency-tfidf/
Automatic metadata generation « Infomotions Mini-Musings
http://infomotions.com/blog/2009/07/automatic-metadata-generation/

On BM25 scoring - summerbell - ITeye tech site
http://summerbell.iteye.com/blog/420084
A brief analysis of the BM25 algorithm - iPie : 思维碎片
http://ipie.blogbus.com/logs/104136815.html
Project2--Modifying Lucene's ranking algorithm: the BM25 algorithm - wbia2010lkl's column - Blog channel - CSDN.NET
http://blog.csdn.net/wbia2010lkl/article/details/6046661
Building a site-specific crawler with Heritrix
http://www.ibm.com/developerworks/cn/opensource/os-cn-heritrix/?S_TACT=105AGX52&S_CMP=reg-ccid
Search engine content relevance | 崔永秀
http://cuiyongxiu.com/201201/01231.html


Patent US7644076 - Clustering strings using N-grams - Google Patents
http://www.google.com/patents/US7644076

-end-

Sunday, May 13, 2012

Daily Bookmarks 20120513


http://hetland.org/coding/python/levenshtein.py
How to map the most "similar" strings from one list to another in python? - Stack Overflow
http://stackoverflow.com/questions/8432799/how-to-map-the-most-similar-strings-from-one-list-to-another-in-python
string - Given two python lists of same length. How to return the best matches of similar values? - Stack Overflow
http://stackoverflow.com/questions/7062340/given-two-python-lists-of-same-length-how-to-return-the-best-matches-of-similar/7093523#7093523
efficient list mapping in python - Stack Overflow
http://stackoverflow.com/questions/2729135/efficient-list-mapping-in-python
Web Intelligence and Data Mining Laboratory: A New Suffix Tree Similarity Measure for Document Clustering
http://web204seminar.blogspot.com/2007/07/new-suffix-tree-similarity-measure-for.html
Calculating similarity between text strings in Python | Simpliplant
http://blog.simpliplant.eu/calculating-similarity-between-text-strings-in-python/
graus.nu: Computing string similarity with TF-IDF and Python
http://graus.nu/thesis/string-similarity-with-tfidf-and-python/
The Digital Standard: Why Fuzzy Hashing is Really Cool
http://thedigitalstandard.blogspot.com/2009/11/why-fuzzy-hashing-is-really-cool.html

ScienceNet: retrieval that is both good and fast: Fast Similarity Search - 王靖琰's blog post
http://blog.sciencenet.cn/home.php?mod=space&uid=205121&do=blog&id=332347
Learning Mahout: K-Means Clustering - Leo Zhang - cnblogs
http://www.cnblogs.com/vivounicorn/archive/2011/10/08/2201986.html
Notes on Clustering (1): k-means « Free Mind
http://blog.pluskid.org/?p=17
Time complexity of HAC
http://nlp.stanford.edu/IR-book/html/htmledition/time-complexity-of-hac-1.html
https://docs.google.com/viewer?a=v&q=cache:6Wbsd84uxSwJ:disi.unitn.it/~p2p/RelatedWork/Matching/aj_recordLinkage_06.pdf+&hl=zh-TW&pid=bl&srcid=ADGEEShuznsaLWqyFLY5nAv0yqaU6Tqkk8VPN71LLCFscCXPy2XTmfznaMmpbUENhgUfxEZi1DJ3xCzrBePCzMXRCsbh4eeboLD68qQTc4ELaIHPGG1vKKxJMAp38lFhpIASfWAwfzqQ&sig=AHIEtbQIfWa9dMu2dbPEC1KVqpRtu683zA
similarity clustering pattern match - Google Search
https://www.google.com/search?q=similarity++clustering+pattern+match&hl=zh-TW&client=firefox-a&rls=org.mozilla:zh-TW:official&prmd=imvns&ei=qdevT8y_BZHImQWSmImUCQ&start=10&sa=N&biw=1130&bih=624

-end-

Daily Bookmarks 20120512

Indexing and searching N-grams — Whoosh 2.4.0 documentation
http://packages.python.org/Whoosh/ngrams.html
Lucid Imagination » Auto-Suggest From Popular Queries Using EdgeNGrams
http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/
Quora’s Technology Examined | Big Fast Blog
http://www.bigfastblog.com/quoras-technology-examined
crr » Fast computation of average Levenshtein distances in python
http://crr.ugent.be/programs-data/fast-computation-of-average-levenshtein-distances-in-python-including-old20
A Simple N-gram Calculator: pyngram « The Sunjay Times
http://times.jayliew.com/2010/05/20/a-simple-n-gram-calculator-pyngram/
SimString - A fast and simple algorithm for approximate string matching/retrieval
http://www.chokkan.org/software/simstring/
document - Simple implementation of N-Gram, tf-idf and Cosine similarity in Python - Stack Overflow
http://stackoverflow.com/questions/2380394/simple-implementation-of-n-gram-tf-idf-and-cosine-similarity-in-python
N-Gram generation with counting in Python | thirumal's blog
http://www.thirumal.in/2012/05/n-gram-generation-with-counting-in.html
The structure of approximate groups « What’s new
http://terrytao.wordpress.com/2011/10/24/the-structure-of-approximate-groups/
Linear approximate groups « What’s new
http://terrytao.wordpress.com/2010/01/27/linear-approximate-groups/
Private Medical Record Linkage with Approximate Matching Good !!!!!
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041434/
rongorongo: Approximate Matches
http://www.cslu.ogi.edu/~sproatr/ror/
algorithm - most efficient way to group search results by string similarity - Stack Overflow
http://stackoverflow.com/questions/9921504/most-efficient-way-to-group-search-results-by-string-similarity
Approximate String Matching Engine (Fuzzy Matching Algorithm and Source Code) | Ask A Data Miner - 75,000+ Members
http://www.kdkeys.net/approximate-string-matching-engine-fuzzy-matching-algorithm-and-source-code/

Simple Universally Unique ID (UUID or GUID) « Python recipes « ActiveState Code
http://code.activestate.com/recipes/213761-simple-universally-unique-id-uuid-or-guid/
Medical record linkage in health information systems by approximate string matching and clustering
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1274322/
php - Why is MD5'ing a UUID not a good idea? - Stack Overflow
http://stackoverflow.com/questions/1293741/why-is-md5ing-a-uuid-not-a-good-idea

On Using Two-Phase Filtering in Indexed Approximate String ...

www.cs.uta.fi/~helmu/pubs/spire01.pdf

-end-

Friday, May 11, 2012

Daily Bookmarks 20120511

Filtrify – Tag Filtering jQuery Plugin | Web Resource Source
http://www.webresourcesource.com/filtrify-tag-filtering-jquery-plugin/
GBin1 recommends: a super magical layout plugin for jQuery - Isotope
http://www.gbin1.com/technology/jquerynews/20111022jquerypluginisotope/
A cool dynamic tag-filtered showcase of new motorcycle models built with the jQuery plugin filtrify - gbin1 - cnblogs
http://www.cnblogs.com/gbin1/archive/2012/05/03/2480413.html

Python: a Technique to Append String in a Loop
http://xahlee.org/perl-python/python_append_string_in_loop.html
python - Search a list of strings for any sub-string from another list - Stack Overflow
http://stackoverflow.com/questions/749342/search-a-list-of-strings-for-any-sub-string-from-another-list
[Python] string1 = "abcdefghijklmnopqrstuvwxyz" string2 = "111ijklmno1111klmnopqrstuvwxy - Pastebin.com
http://pastebin.com/KR5y01Lx
Can someone help me find the longest common substring using python code? - Yahoo! Answers
http://answers.yahoo.com/question/index?qid=20110405150910AAgc5fW
python - Multiple Sequence Alignment (Longest Common Subsequence)? - Stack Overflow
http://stackoverflow.com/questions/10073577/multiple-sequence-alignment-longest-common-subsequence
Suffix Tree - TsengYuen's column - Blog channel - CSDN.NET
http://blog.csdn.net/TsengYuen/article/details/4815921
hash - Working with suffix trees in python - Stack Overflow
http://stackoverflow.com/questions/10086558/working-with-suffix-trees-in-python
1.1.3 Suffix Trees and Arrays
http://www.cs.sunysb.edu/~algorith/files/suffix-trees.shtml
bramcohen: Longest Common Substring
http://bramcohen.livejournal.com/22069.html
suffix tree « demonstrate's blog Good site
https://remonstrate.wordpress.com/2012/01/29/suffix-tree/
Suffix Trees
http://www.allisons.org/ll/AlgDS/Tree/Suffix/
Suffix Tree - 忽若流星 - C++ Blog
http://www.cppblog.com/yuyang7/archive/2009/03/29/78252.html

-end-

Thursday, May 10, 2012

Daily Bookmarks 20120510

Git Book, Chinese edition - Interactive adding
http://gitbook.liuhui998.com/4_4.html
Kayak-like filter sliders using jQuery and AJAX pagination in CakePHP | nuts and bolts of cakephp
http://nuts-and-bolts-of-cakephp.com/2009/01/14/kayak-like-filter-sliders-using-jquery-and-ajax-pagination-in-cakephp/
Magento - shop by price - price range ? - HTML, XHTML, CSS, Design Questions - eCommerce Software for Growth
http://www.magentocommerce.com/boards/viewthread/10213/
jQuery price filter? - LemonStand Forum
http://forum.lemonstandapp.com/topic/2833-jquery-price-filter/




-end-

Wednesday, May 09, 2012

Daily Bookmarks 20120509

javascript - jQuery Lazy Loading - Problem with display:none - Stack Overflow
http://stackoverflow.com/questions/5764006/jquery-lazy-loading-problem-with-displaynone
Reworking jQuery lazyLoad a little | 木木木木木
http://immmmm.com/transformation-jquery-lazyload.html
Floatutorial: Step by step CSS float tutorial
http://css.maxdesign.com.au/floatutorial/
CSS floats
http://www.w3school.com.cn/css/css_positioning_floating.asp




-end-

Daily Bookmarks 20120508

18.2. Using the timeit module
http://woodpecker.org.cn/diveintopython/performance_tuning/timeit.html
timeit – Time the execution of small bits of Python code. - Python Module of the Week
http://www.doughellmann.com/PyMOTW/timeit/
TimeComplexity - PythonInfo Wiki
http://wiki.python.org/moin/TimeComplexity
0x80 - 911 - cnblogs
http://www.cnblogs.com/911/articles/1210786.html
Massive data processing series (4): Bit-map
http://www.360doc.com/content/11/0928/09/7656248_151799050.shtml
A collection of massive data processing interview questions, with Bit-map explained in detail
http://www.uml.org.cn/sjjm/201109061.asp
Seedlet: An overview of information retrieval and information filtering methods
http://seedlet.blogspot.com/2007/08/blog-post_07.html
A summary of sorting massive data - software development, testing, management, design
http://www.guan8.net/Java/424526.html
How a bitmap is implemented in Python - Xiao_Qiang_'s column - Blog channel - CSDN.NET
http://blog.csdn.net/xiao_qiang_/article/details/3013448
BitReader – Python module for reading bits from bytes
http://opensourcehacker.com/2010/09/08/bitreader-python-module-for-reading-bits-from-bytes/
Using Python How can I read the bits in a byte? - Stack Overflow
http://stackoverflow.com/questions/2576712/using-python-how-can-i-read-the-bits-in-a-byte
Fun data structures – BitMap | 小e的分享 | joy shared beats joy kept to yourself
http://www.wikieno.com/2012/03/data-structure-bitmap/
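Matrix67: My Blog » Blog Archive » A casual talk on Chinese automatic word segmentation and semantic recognition (part 1): Chinese word segmentation algorithms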
http://www.matrix67.com/blog/archives/4212
Massive data processing - 鹰击长空 - C++ Blog
http://www.cppblog.com/ylfeng/archive/2011/03/14/141788.html 

-end-

Tuesday, May 08, 2012

Daily Bookmarks 20120507

Common string matching algorithms - meixr's column - Blog channel - CSDN.NET
http://blog.csdn.net/meixr/article/details/6456896
wikipedia note(II): string searching algorithm_RONGEK_Baidu Space
http://hi.baidu.com/rongekuta/blog/item/1ca7efd64b87fe9ea1ec9c79.html


-end-

Friday, May 04, 2012

Daily Bookmarks 20120504

Suffix Array: theory and ideas_刻录时光_Baidu Space
http://hi.baidu.com/maydaygmail/blog/item/804c8ecad0d2d11f92457e9b.html
suffix array @ home of benbendog :: PIXNET ::
http://lettice0913.pixnet.net/blog/post/30179452-suffix-array-(-%E5%BE%8C%E7%B6%B4%E9%99%A3%E5%88%97-)
Suffix Array python
http://waisp.lis.ntu.edu.tw/homepage/docs/html_scripts/SuffixArray.txt
suffix array « demonstrate 的 blog
http://remonstrate.wordpress.com/tag/suffix-array/
search.py - cnlstk - Chinese Natural Language Statistical Toolkit, CNLSTK - Google Project Hosting
http://code.google.com/p/cnlstk/source/browse/trunk/cnlstk/search.py
Algorithms with Python / Suffix array
http://www.geocities.jp/m_hiroi/light/pyalgo46.html

Brief Introduction to Suffix Array
http://sary.sourceforge.net/docs/suffix-array.html
Life is experience. Experiencing your life.: suffix array: a useful data structure in string manipulation
http://tqhh.blogspot.com/2006/02/suffix-array-useful-data-structure-in.html

Suffix array algorithms (1) | Try Again
http://richardxx.yo2.cn/articles/%e5%90%8e%e7%bc%80%e6%95%b0%e7%bb%84%e7%9b%b8%e5%85%b3%e7%ae%97%e6%b3%95%ef%bc%88%e4%b8%80%ef%bc%89-2.html
Suffix array algorithms (2) | Try Again
http://richardxx.yo2.cn/articles/%e5%90%8e%e7%bc%80%e6%95%b0%e7%bb%84%e7%9b%b8%e5%85%b3%e7%ae%97%e6%b3%95%ef%bc%88%e4%ba%8c%ef%bc%89.html


Binary tree implementation in C question as found in K&R - Stack Overflow
http://stackoverflow.com/questions/6561644/binary-tree-implementation-in-c-question-as-found-in-kr
Binary Search Trees
http://pages.cs.wisc.edu/~hasti/cs367-1/readings/Binary-Search-Trees/index.html
C Program to implement Binary Search Tree Traversal | C Programming : Programs |
http://www.c4learn.com/c-program-to-implement-binary-search-tree-traversal.html

A killer trick: Python JSON Encoder - 沈崴's journal - NetEase Blog
http://eishn.blog.163.com/blog/static/65231820070181041204/
Programming Pearls
http://www.cs.bell-labs.com/cm/cs/pearls/index.html

IInterest'Blog
http://www.iinterest.net/2009/02/11/jquery-sequential-list/
jQuery plugin: fading news scroller - CssRain front-end technology - Readers progress far faster than the blog does.
http://www.cssrain.cn/?p=225
jQuery plugin: a pop-up menu effect modeled on the Joyo Amazon home page - CssRain front-end technology - Readers progress far faster than the blog does.
http://www.cssrain.cn/?p=418
jQuery: a product display effect modeled on Taobao - CssRain front-end technology - Readers progress far faster than the blog does.
http://www.cssrain.cn/?p=417
Ajax tab content switching (3 demos) - CssRain front-end technology - Readers progress far faster than the blog does.
http://www.cssrain.cn/?p=1369
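A carousel effect modeled on the Yahoo home page, jQuery version [original] | web front-end, personal blog of 小白 from Hangzhou, 小白's personal blog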
http://www.xiaobai8.com/Blog/413.html

-end-

Tuesday, May 01, 2012

Daily Bookmarks 20120501

Multiplying a subset of a list of integers together in python - Stack Overflow
http://stackoverflow.com/questions/2165449/multiplying-a-subset-of-a-list-of-integers-together-in-python
Usage details of the Python standard library urllib2 | 道可道
http://zhuoqiang.me/a/python-urllib2-usage
Changing Your User Agent in Python | pythonFilter
http://pythonfilter.com/blog/changing-or-spoofing-your-user-agent-python.html
A suffix-search-based multi-pattern matching algorithm: the Wu-Manber algorithm - Memory's journal - NetEase Blog
http://richiememory.blog.163.com/blog/static/116007414201110504330374/

Tries and the Aho-Corasick automaton - 遥远的街市
http://blog.henix.info/blog/trie-aho-corasick.html
Dictionary tree [Trie]_朱君扬_Baidu Space
http://hi.baidu.com/%D6%EC%BE%FD%D1%EF/blog/item/011d6608962cc9157aec2c48.html
Example applications of the dictionary tree (Trie Tree) « Yee-fan Zhu
http://czb.hk/zyf/index.php/trie-tree-sample/
hdu 1251 统计难题 (a counting problem) - trie - dictionary tree « ☆零☆ || coderling
http://coderling.comuf.com/?p=880
The trie: advantages and implementation of the dictionary tree « 呢喃的歌声
http://www.singmelody.com/?p=782
Data structures: the trie | 董的博客
http://dongxicheng.org/structure/trietree/
Using Tries - godorz…
http://godorz.info/2009/11/using-tries/
Dictionary tree: the trie - godorz…
http://godorz.info/2009/11/trie/
[Trie] NOIP practice problem: 好感統計 (star), solution report - Yeefan's Blog
http://yeefanblog.appspot.com/noip-practice-star.html



Does Java pass arguments by value or by reference? Reposting a very concise article - 张林林|深蓝 (Linlin Zhang, shenlan211314)'s column - Blog channel - CSDN.NET
http://blog.csdn.net/shenlan211314/article/details/6692421

Python notes: the and/or trap - 未知世界 - ITeye tech site
http://calmness.iteye.com/blog/227020
Python ternary operation and the and/or trap_土拨先生_Baidu Space
http://hi.baidu.com/tuuboo/blog/item/0945d3fc2c4fbe43d6887dbb.html
[python] A pitfall with and/or expressions - cpunion - cnblogs
http://www.cnblogs.com/cpunion/archive/2005/08/01/204597.html
4.6. The Peculiar Nature of and and or
http://www.diveintopython.net/power_of_introspection/and_or.html

-end-