Thursday, May 31, 2012

Daily Bookmarks 20120531

jQuery: How to count the number of elements which "display" isn't "none"? - Stack Overflow
jQuery count div that has a display:none attribute - Stack Overflow
css - Jquery count number of hidden elements within div - Stack Overflow
javascript - jQuery toggle show/hide elements after certain number of matching elements - Stack Overflow nice
Understanding "event handlers" in JavaScript- onLoad Event handlers

html - Jquery: Adding a class to even an odd List-Elements - Stack Overflow
Checkbox filters with jQuery | Ask the CSS Guy
35 Useful jQuery Filter and Sort Plugins
javascript - jQuery multiple checkbox filters - Stack Overflow

Practical Data Analysis in Python
Text Classification for Sentiment Analysis – Eliminate Low Information Features | StreamHacker
Text Classification for Sentiment Analysis – Naive Bayes Classifier | StreamHacker


Wednesday, May 30, 2012

Daily Bookmarks 20120530

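Common Techniques for Scalable Architecture: Data Partitioning - MongoDB, data partitioning, horizontal partitioning, vertical partitioning - Java - ITeye Forum
Instagram's ID Generation Strategy [Translation] - ITeye Tech Site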
High Scalability - High Scalability - How Twitter Stores 250 Million Tweets a Day Using MySQL
Instagram Engineering • Sharding & IDs at Instagram
Django | scot hacker's foobar blog
sql - What would be the right steps for horizontal partitioning in Postgresql? - Stack Overflow
A Great Instagr.am Web Client Experience | 天涯海阁 | Web2.0Share
Instagr.am Says Its Android Version Is in Some Ways Better than the iPhone Version - winandmac.com Hong Kong Edition

Short Domains/URL Shortening/Base36/Base62 - 打天打鸭 - ITeye Tech Site
Picture Previews with Ruby « Jason Neylon's Blog
Using the Instagram API: shortcode to media id
How to get media-id from shortcode/url. - Instagram API Developers | Google Groups
Instagram Engineering • Storing hundreds of millions of simple key-value pairs in Redis
Use a zipfile store a dict like k-v database — Gist

jQuery price slider filter - Stack Overflow
jquery ui slider double slider filtering - Stack Overflow
Kayak-like filter sliders using jQuery and AJAX pagination in CakePHP | nuts and bolts of cakephp


Tuesday, May 29, 2012

Daily Bookmarks 20120529

How to Protect your Eyes from Continuous Computer Operation
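Coding in Dreams [Study Notes] Probabilistic Latent Semantic Analysis (pLSA)
Coding in Dreams [Study Notes] The Expectation-Maximization (EM) Algorithm
As Search Engine Competition Intensifies, Startup Helioid Launches a Categorized Search Engine Focused on the Academic Market | 36Kr
Integrating the MailChimp Service into a Django Site - Herock Post
Giving Away a Book: 小强升职记 - Herock Post
SWODE: A Personalized Portal Site - Herock Post
GTD Should Be Simple and Efficient - Herock Post
ThinkingRock: The Best GTD Software - 褪墨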


Monday, May 28, 2012

Daily Bookmarks 20120528

Coding Horror: URL Shortening: Hashes In Practice
High Scalability - High Scalability - YouTube Architecture
Generate Random Strings Using PHP | Nine-One-One... Need Code, Help!
Introduction to Theoretical Computer Science: From Hash Functions to Wang Xiaoyun's MD5 Break « 阅微堂
How Intel is Solving the Problems with Random Number Generation - Tested
石頭閒語: Using an MD5 Library in C Programs and Its Applications - Roodo Blog
hash - MD5 and SHA-2 collisions in Python - Stack Overflow
git 101 – The git Object Model | Ricky's murmur...
How do I create unique IDs, like YouTube? | LinkedIn
Flickr: Discussing manufacturing style photo URLs in Flickr API
YouTube URL algorithm? - Stack Overflow
Are YouTube codes guaranteed to always be 11 characters? - Web Applications

image md5 filename
Database versus files for Images | dave dash
Resizing Image on upload in Django | dave dash
filenames - Names of image files on a website: hashes or random strings? - Stack Overflow
Duplicate file/image checking with PHP and MD5 - webdevRefinery Forum
storage - Storing a million images in the filesystem - Server Fault
file structure - Convert filenames to their checksum before saving to prevent duplicates. Is is a smart thing to do? - Programmers
hashing - A PHP hash function with a long output length? - Stack Overflow
Coding Horror: Hashtables, Pigeonholes, and Birthdays

store image file name md5 - Google Search


Saturday, May 26, 2012

Daily Bookmarks 20120526

Frighteningly Ambitious Startup Ideas
Helioid » Paul Graham’s “Frighteningly Ambitious” Ideas
Helioid’s Search Engine Provides Category Sorting To Aid Research, Targets Students And Professionals | TechCrunch
Pairwise Document Similarity in Large Collections with MapReduce
[Java] Pairwise Vector Similarity by Cosine Similarity @ Hadoop 0.20.1 @ 第二十四個夏天後 :: PIXNET :: compare with the previous entry
Pairwise Document Similarity - Google Search
Result Diversity - Google Search
A simple Distributed Hash Table (DHT) « / Reza
Prefix Hash Tree_我思故我在_Baidu Space

Helioid » Literature Review 2008 – 2009
Helioid » Search Refinement (and Helioid) is the Top Tech Trend Video look

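Linux-HA Open Source Software Heartbeat (Installation) - Technology Makes Dreams Come True - 51CTO Tech Blog
How to Make a Team More Efficient - 小菜_默 - 51CTO Tech Blog
Work Log 2007, Part 1 - srsunbing - 51CTO Tech Blog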
LINE Storage: Storing billions of rows in Sharded-Redis and HBase per Month « NAVER Engineers' Blog

term frequency/inverse document frequency (TFIDF) « Infomotions Mini-Musings

Brute force and indexed approaches to pairwise document similarity comparisons with MapReduce - Google Search

Hashed Patricia Trie: Efficient Longest Prefix Matching in Peer-to ...



Thursday, May 24, 2012

Daily Bookmarks 20120524

Ask HN: how can I generate youtube style id? | Hacker News
How do I find the unique video ID in new sharing URL? - Google Groups
Create short IDs with PHP - Like Youtube or TinyURL
php5 - PHP: Short id like Youtube, with salt - Stack Overflow
Base58 Encode and Decode using PHP with example; base58_encode(), base58_decode() « Dark Launch
Flickr: Discussing manufacturing style photo URLs in Flickr API
YouTube: Does your video ID system really work? | ZDNet
Useful YouTube URL Tricks
what's the youtube video id maximum length? - Stack Overflow
c# - YouTube-like GUID - Stack Overflow
YouTube Co-Founder Explains New Video ID System | Epicenter |
YouTube Dataset
Tim Wu's notes: dropbox storage techniques leaked by dropship
From the README, in case it wasn't obvious:"These utilities make use of the dedu... | Hacker News

石頭閒語: Using an MD5 Library in C Programs and Its Applications - Roodo Blog

Yet Another Chris - C# and .NET - Latest posts - Friendly Unique Id Generation Part 1

YouTube/ TinyURL style unique id strings?
MD5 hashing 4-byte and 8-byte keys into 16-byte values; what's the chance of a collision? - Stack Overflow

Counting YouTube Videos via Random Prefix Sampling


Monday, May 21, 2012

Daily Bookmarks 20120520

Code: Flickr Developer Blog » Ticket Servers: Distributed Unique Primary Keys on the Cheap
Flickr Authorization
What is the design and architecture of's short URLs? - Quora

sullof/shardjs design short url Good~~~
Django snippets: A custom URL shortening view, for use with rev=canonical
URL Shortener Web Application Using Django
Create your own simple short URL generator website « Kaos Coder
Applying base62 Encoding to Weibo Short Links | CodeTick | Covering All Kinds of Programming Techniques
rev=canonical bookmarklet and designing shorter URLs
Reverse-Engineering Sina's Short URL Generation « 一路走来……
vikasing: A Simple URL Shortening Algorithm in JAVA
How to make unique short URL with Python? - Stack Overflow
language agnostic - Tinyurl-style unique code: potential algorithm to prevent collisions - Stack Overflow
algorithm - Map incrementing integer range to six-digit base 26 max, but unpredictably - Stack Overflow


Thursday, May 17, 2012

Daily Bookmarks 20120516

Fuzzy matching/chunking algorithm - Stack Overflow
9.2.7 Numeric Comparison with Percentage Tolerance  'FieldComparatorNumericPerc'
n-Gram/2L-approximation: a two-level n-gram inverted index ...
CiteSeerX — A Practical q-Gram Index for Text Retrieval Allowing Errors
Go together. - Wagn nice app RoR
k-gram indexes for wildcard queries
Group-average agglomerative clustering
Permuterm Subject Index
Permuterm index for cs707_011712
Building A Python-Based Search Engine — PyCon2012 Schedule & Notes 1.0 documentation

bi gram index python - Google Search
Indexing Text in Python
Faceting — Haystack 2.0.0-beta documentation
Sites Using Haystack — Haystack 2.0.0-beta documentation
term frequency/inverse document frequency (TFIDF) « Infomotions Mini-Musings
Automatic metadata generation « Infomotions Mini-Musings

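On BM25 Scoring - summerbell - ITeye Tech Site
A Brief Analysis of the BM25 Algorithm - iPie : Fragments of Thought
Project2--Modifying Lucene's Ranking Algorithm: The BM25 Algorithm - wbia2010lkl's Column - Blog Channel - CSDN.NET
Building a Site-Specific Crawler with Heritrix
Search Engine Content Relevance | 崔永秀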

Patent US7644076 - Clustering strings using N-grams - Google Patents


Sunday, May 13, 2012

Daily Bookmarks 20120513
How to map the most "similar" strings from one list to another in python? - Stack Overflow
string - Given two python lists of same length. How to return the best matches of similar values? - Stack Overflow
efficient list mapping in python - Stack Overflow
Web Intelligence and Data Mining Laboratory: A New Suffix Tree Similarity Measure for Document Clustering
Calculating similarity between text strings in Python | Simpliplant
Computing string similarity with TF-IDF and Python
The Digital Standard: Why Fuzzy Hashing is Really Cool

ScienceNet: Good and Fast Retrieval: Fast Similarity Search - 王靖琰's Blog Post
Learning Mahout: K-Means Clustering - Leo Zhang - cnblogs
A Casual Talk on Clustering (1): k-means « Free Mind
Time complexity of HAC
similarity clustering pattern match - Google Search


Daily Bookmarks 20120512

Indexing and searching N-grams — Whoosh 2.4.0 documentation
Lucid Imagination » Auto-Suggest From Popular Queries Using EdgeNGrams
Quora’s Technology Examined | Big Fast Blog
crr » Fast computation of average Levenshtein distances in python
A Simple N-gram Calculator: pyngram « The Sunjay Times
SimString - A fast and simple algorithm for approximate string matching/retrieval
document - Simple implementation of N-Gram, tf-idf and Cosine similarity in Python - Stack Overflow
N-Gram generation with counting in Python | thirumal's blog
The structure of approximate groups « What’s new
Linear approximate groups « What’s new
Private Medical Record Linkage with Approximate Matching Good !!!!!
rongorongo: Approximate Matches
algorithm - most efficient way to group search results by string similarity - Stack Overflow
Approximate String Matching Engine (Fuzzy Matching Algorithm and Source Code) | Ask A Data Miner - 75,000+ Members

Simple Universally Unique ID (UUID or GUID) « Python recipes « ActiveState Code
Medical record linkage in health information systems by approximate string matching and clustering
php - Why is MD5'ing a UUID not a good idea? - Stack Overflow

On Using Two-Phase Filtering in Indexed Approximate String ...


Friday, May 11, 2012

Daily Bookmarks 20120511

Filtrify – Tag Filtering jQuery Plugin | Web Resource Source
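GBin1 Recommends: A Super Magical Layout Plugin for jQuery - Isotope
A Super Cool Showcase of New Motorcycle Models with Dynamic Tag Filtering, Built with the jQuery Plugin filtrify - gbin1 - cnblogs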

Python: a Technique to Append String in a Loop
python - Search a list of strings for any sub-string from another list - Stack Overflow
[Python] string1 = "abcdefghijklmnopqrstuvwxyz" string2 = "111ijklmno1111klmnopqrstuvwxy -
Can someone help me find the longest common substring using python code? - Yahoo! Answers
python - Multiple Sequence Alignment (Longest Common Subsequence)? - Stack Overflow
Suffix Tree - TsengYuen's Column - Blog Channel - CSDN.NET
hash - Working with suffix trees in python - Stack Overflow
1.1.3 Suffix Trees and Arrays
bramcohen: Longest Common Substring
suffix tree « demonstrate's blog Good site
Suffix Trees
Suffix Tree - 忽若流星 - C++ Blog


Thursday, May 10, 2012

Daily Bookmarks 20120510

Git Book Chinese Edition - Interactive Adding
Kayak-like filter sliders using jQuery and AJAX pagination in CakePHP | nuts and bolts of cakephp
Magento - shop by price - price range ? - HTML, XHTML, CSS, Design Questions - eCommerce Software for Growth
jQuery price filter? - LemonStand Forum


Wednesday, May 09, 2012

Daily Bookmarks 20120509

javascript - jQuery Lazy Loading - Problem with display:none - Stack Overflow
Reworking It a Bit: jQuery lazyLoad | 木木木木木
Floatutorial: Step by step CSS float tutorial
CSS Floats


Daily Bookmarks 20120508

18.2. Using the timeit Module
timeit – Time the execution of small bits of Python code. - Python Module of the Week
TimeComplexity - PythonInfo Wiki
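0x80 - 911 - cnblogs
Seedlet: An Overview of Information Retrieval and Information Filtering Methods
Summary of Sorting Massive Data - Software Development, Testing, Management, Design
How a Bitmap Works, Implemented in Python - Xiao_Qiang_'s Column - Blog Channel - CSDN.NET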
BitReader – Python module for reading bits from bytes
Using Python How can I read the bits in a byte? - Stack Overflow
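Fun Data Structures – BitMap | 小e's Sharing | Sharing Joy Beats Enjoying It Alone
Matrix67: My Blog » Blog Archive » A Casual Talk on Chinese Automatic Word Segmentation and Semantic Recognition (Part 1): Chinese Word Segmentation Algorithms
Massive Data Processing - 鹰击长空 - C++ Blog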


Tuesday, May 08, 2012

Daily Bookmarks 20120507

Common String Matching Algorithms - meixr's Column - Blog Channel - CSDN.NET
wikipedia note(II): string searching algorithm_RONGEK_Baidu Space


Friday, May 04, 2012

Daily Bookmarks 20120504

Suffix Array: Theory and Ideas_刻录时光_Baidu Space
suffix array @ home of benbendog :: PIXNET ::
Suffix Array python
suffix array « demonstrate's blog - cnlstk - Chinese Natural Language Statistical Toolkit, CNLSTK - Google Project Hosting
Algorithms with Python / Suffix Array

Brief Introduction to Suffix Array
Life is experience. Experiencing your life.: suffix array: a useful data structure in string manipulation

Suffix Array Algorithms (1) | Try Again
Suffix Array Algorithms (2) | Try Again

Binary tree implementation in C question as found in K&R - Stack Overflow
Binary Search Trees
C Program to implement Binary Search Tree Traversal | C Programming : Programs |

Killer Trick: Python JSON Encoder - 沈崴's Journal - NetEase Blog
Programming Pearls

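jQuery Plugin: Fading News Scroller. - CssRain - Front-End Technology - Readers improve much faster than the blog does.
jQuery Plugin: A Pop-Up Menu Effect Modeled on the Joyo Amazon Home Page. - CssRain - Front-End Technology - Readers improve much faster than the blog does.
jQuery: A Product Showcase Effect Modeled on Taobao. - CssRain - Front-End Technology - Readers improve much faster than the blog does.
ajax Tab Content Switching (3 demos). - CssRain - Front-End Technology - Readers improve much faster than the blog does.
A Yahoo Home Page Style Carousel Effect, jQuery Version [Original] | Web Front End, 杭州小白's Personal Blog, 小白's Personal Blog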


Tuesday, May 01, 2012

Daily Bookmarks 20120501

Multiplying a subset of a list of integers together in python - Stack Overflow
Usage Details of the Python Standard Library urllib2 | 道可道
Changing Your User Agent in Python | pythonFilter
A Suffix-Search-Based Multi-Pattern Matching Algorithm: The Wu-Manber Algorithm - Memory's Journal - NetEase Blog

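trie and the AC Automaton - 遥远的街市
Example Applications of the Trie Tree « Yee-fan Zhu
hdu 1251 统计难题 (Counting Problem) - trie tree - dictionary tree « ☆零☆ || coderling
The Trie: Advantages and Implementation of the Dictionary Tree « 呢喃的歌声
Data Structures: The Trie | 董's Blog
Using Tries - godorz…
The Trie (Dictionary Tree) - godorz…
[Trie] NOIP Practice Problem: 好感統計 star Solution Report - Yeefan's Blog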

Does Java Pass Arguments by Value or by Reference? Reposting a Very Concise Article - 张林林|深蓝 (Linlin Zhang, shenlan211314)'s Column - Blog Channel - CSDN.NET

Python Notes: The and/or Pitfall - 未知世界 - ITeye Tech Site
Python's Ternary Operation and the and/or Pitfall_土拨先生_Baidu Space
[python] A Pitfall with and/or Expressions. - cpunion - cnblogs
4.6. The Peculiar Nature of and and or