Hacker News Full Feed

Feed URL
http://feeds.feedburner.com/HackerNewsFullFeed
Web URL
http://lazyreadr.appspot.com/feeds/public.rss?id=167356
Average words per item
927
Number of subscribers
15
Recent processing errors
None

Google can index AJAX apps via HTML #Fragments

January 29, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr

Full Specification - Making AJAX Applications Crawlable

This document describes an agreement between web servers and search engine crawlers that allows for dynamically created content to be visible to crawlers. Google currently supports this agreement. The hope is that other search engines will also adopt this proposal.

  • Web application: In this document, a web application is an AJAX-enabled, interactive web application.
  • State: While traditional static web sites consist of many pages, a more appropriate term for AJAX applications is "state". An application consists of a number of states, where each state constitutes a specific user experience or a response to user input. Examples of states: For a mail application, states could be base state, inbox, compose, etc. For a chess application, states could be base state, start new game, but also current state x of the chessboard, including information about past moves, whose player's turn it is, and so forth. In an AJAX application, a state often corresponds to a URL with a hash fragment.
  • Hash fragments: Traditionally, hash fragments (that is, everything after # in the URL) have been used to indicate one portion of a static HTML document. By contrast, AJAX applications often use hash fragments in another function, namely to indicate state. For example, when a user navigates to the URL http://www.example.com/ajax.html#key1=value1&key2=value2, the AJAX application will parse the hash fragment and move the application to the "key1=value1&key2=value2" state. This is similar in spirit to moving to a portion of a static document, that is, the traditional use of hash fragments. History (the back button) in AJAX applications is generally handled with these hash fragments as well. Why are hash fragments used in this way? While the same effect could often be achieved with query parameters (for example, ?key1=value1&key2=value2), hash fragments have the advantage that in and of themselves, they do not incur an HTTP request and thus no round-trip from the browser to the server and back. In other words, when navigating from www.example.com/ajax.html to www.example.com/ajax.html#key1=value1&key2=value2, the web application moves to the state key1=value1&key2=value2 without a full page refresh. As such, hash fragments are an important tool in making AJAX applications fast and responsive. Importantly, however, hash fragments are not part of HTTP requests (and as a result they are not sent to the server), which is why our approach must handle them in a new way. See RFC 3986 for more details on hash fragments.
  • Query parameters: Query parameters (for example, ?s=value in the URL) are used by web sites and applications to post to or obtain information from the server. They incur a server round-trip and full page reload. In other words, navigating from www.example.com to www.example.com?s=value is handled by an HTTP request to the server and a full page reload. See RFC 3986 for more details. Query parameters are routinely used in AJAX applications as well.
  • HTML snapshot: An HTML snapshot is the serialization of the DOM that the browser produces when loading the page, including executing any JavaScript needed to render the initial page.
  • Pretty URL: Any URL containing a hash fragment beginning with !, for example, www.example.com?myquery#!key1=value1&key2=value2
  • Ugly URL: Any URL containing a query parameter with the key _escaped_fragment_, for example, www.example.com?myquery&_escaped_fragment_=key1=value1%26key2=value2.

A bidirectional mapping exists between pretty and ugly URLs:

?_escaped_fragment_=key1=value1%26key2=value2: used for crawling only, indicates an indexable AJAX app state

#!key1=value1&key2=value2: used for normal (browser) web site interaction

Mapping from #! to _escaped_fragment_ format

Each URL that contains a hash fragment beginning with an exclamation mark is considered a #! URL. Note that any URL may contain at most one hash fragment. Each pretty (#!) URL has a corresponding ugly (_escaped_fragment_) URL, which is derived with the following steps:

  1. The hash fragment becomes part of the query parameters.
  2. The hash fragment is indicated in the query parameters by preceding it with _escaped_fragment_=
  3. Some characters are escaped when the hash fragment becomes part of the query parameters. These characters are listed below.
  4. All other parts of the URL (host, port, path, existing query parameters, and so on) remain unchanged.
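To make the forward mapping concrete, here is a minimal Python sketch of steps 1-4, using the character list from the "Escaping characters in the bidirectional mapping" section below. The function names are illustrative only and not part of the specification.

# Minimal sketch of the pretty (#!) to ugly (_escaped_fragment_) mapping.
# escape_fragment implements the character list from the escaping section:
# 0x00..0x20, '#' (0x23), '%' (0x25), '&' (0x26), '+' (0x2B) and 0x7F..0xFF,
# with non-ASCII text converted to UTF-8 before escaping.
def escape_fragment(fragment):
    out = []
    for byte in fragment.encode("utf-8"):
        if byte <= 0x20 or byte in (0x23, 0x25, 0x26, 0x2B) or byte >= 0x7F:
            out.append("%%%02X" % byte)
        else:
            out.append(chr(byte))
    return "".join(out)

def pretty_to_ugly(url):
    # Split off the hash fragment; URLs without a #! fragment pass through.
    base, sep, fragment = url.partition("#!")
    if not sep:
        return url
    # The escaped fragment joins the query parameters under the key
    # _escaped_fragment_; host, port, path and existing query parameters
    # remain unchanged.
    delimiter = "&" if "?" in base else "?"
    return base + delimiter + "_escaped_fragment_=" + escape_fragment(fragment)

For example, pretty_to_ugly("www.example.com#!key1=value1&key2=value2") yields www.example.com?_escaped_fragment_=key1=value1%26key2=value2, matching the example in the "Transformation of URL" section below.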

Mapping from _escaped_fragment_ format to #! format

Any URL whose query parameters contain the special token _escaped_fragment_ as the last query parameter is considered an _escaped_fragment_ URL. Further, there must only be one _escaped_fragment_ in the URL, and it must be the last query parameter. The corresponding #! URL can be derived with the following steps:

  1. Remove from the URL all tokens beginning with _escaped_fragment_= (Note especially that the = must be removed as well).
  2. Remove from the URL the trailing ? or & (depending on whether the URL had query parameters other than _escaped_fragment_).
  3. Add to the URL the tokens #!.
  4. Add to the URL all tokens after _escaped_fragment_= after unescaping them.

Note: As is explained below, there is a special syntax for pages without hash fragments, but that still contain dynamic Ajax content. For those pages, to map from the _escaped_fragment_ URL to the original URL, omit steps 3 and 4 above.
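A matching Python sketch of the reverse direction, including the no-hash-fragment special case from the note above, might look like this (again, the function name is illustrative; unquote undoes the escaping described in the next section):

from urllib.parse import unquote

def ugly_to_pretty(url):
    token = "_escaped_fragment_="
    index = url.find(token)
    if index == -1:
        return url                       # not an _escaped_fragment_ URL
    escaped = url[index + len(token):]   # tokens after _escaped_fragment_=
    base = url[:index - 1]               # also drops the trailing '?' or '&'
    if escaped == "":
        return base                      # meta-tag case: original URL has no #!
    return base + "#!" + unquote(escaped)

For example, ugly_to_pretty("www.example.com?user=userid&_escaped_fragment_=key1=value1%26key2=value2") returns www.example.com?user=userid#!key1=value1&key2=value2.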

Escaping characters in the bidirectional mapping

The following characters will be escaped when moving the hash fragment string to the query parameters of the URL, and must be unescaped by the web server to obtain the original URL:

  • %00..20
  • %23
  • %25..26
  • %2B
  • %7F..FF

Control characters (0x00..1F and 0x7F) should be avoided. Non-ASCII text will be converted to UTF-8 before escaping.

Transformation of URL

  1. URLs of the format domain[:port]/path#!hashfragment, for example, www.example.com#!key1=value1&key2=value2 are temporarily transformed into domain[:port]/path?_escaped_fragment_=hashfragment, such as www.example.com?_escaped_fragment_=key1=value1%26key2=value2. In other words, a hash fragment beginning with an exclamation mark ('!') is turned into a query parameter. We refer to the former as "pretty URLs" and to the latter as "ugly URLs".
  2. URLs of the format domain[:port]/path?queryparams#!hashfragment (for example, www.example.com?user=userid#!key1=value1&key2=value2) are temporarily transformed into domain[:port]/path?queryparams&_escaped_fragment_=hashfragment (for the above example, www.example.com?user=userid&_escaped_fragment_=key1=value1%26key2=value2). In other words, a hash fragment beginning with an exclamation mark ('!') is made part of the existing query parameters by adding a query parameter with the key "_escaped_fragment_" and the value of the hash fragment without the "!". As in this case the URL already contains query parameters, the new query parameter is delimited from the existing ones with the standard delimiter '&'. We refer to the former #! as "pretty URLs" and to the latter _escaped_fragment_ URLs as "ugly URLs".
  3. Some characters are escaped when making a hash fragment part of the query parameters. See the previous section for more information.
  4. If a page has no hash fragment, but contains <meta name="fragment" content="!"> in the <head> of the HTML, the crawler will transform the URL of this page from domain[:port]/path to domain[:port]/path?_escaped_fragment_= (or domain[:port]/path?queryparams to domain[:port]/path?queryparams&_escaped_fragment_=) and will then access the transformed URL. For example, if www.example.com contains <meta name="fragment" content="!"> in its head, the crawler will transform this URL into www.example.com?_escaped_fragment_= and fetch www.example.com?_escaped_fragment_= from the web server.

Request

The crawler agrees to request from the server ugly URLs of the format:

  • domain[:port]/path?_escaped_fragment_=hashfragment
  • domain[:port]/path?queryparams&_escaped_fragment_=hashfragment
  • domain[:port]/path?_escaped_fragment_=
  • domain[:port]/path?queryparams&_escaped_fragment_=

Search result

The search engine agrees to display in the search results the corresponding pretty URLs:

  • domain[:port]/path#!hashfragment
  • domain[:port]/path?queryparams#!hashfragment
  • domain[:port]/path
  • domain[:port]/path?queryparams

Opting into the AJAX crawling scheme

The application must opt into the AJAX crawling scheme to notify the crawler to request ugly URLs. An application can opt in with either or both of the following:

  • Use #! in your site's hash fragments.
  • Add a trigger to the head of the HTML of a page without a hash fragment (for example, your home page):
    <meta name="fragment" content="!">
    

Once the scheme is implemented, AJAX URLs containing hash fragments with #! are eligible to be crawled and indexed by the search engine.

Transformation of URL

In response to a request of a URL that contains _escaped_fragment_ (which should always be a request from a crawler), the server agrees to return an HTML snapshot of the corresponding pretty #! URL. See above for the mapping between _escaped_fragment_ (ugly) URLs and #! (pretty) URLs.

Serving the HTML snapshot corresponding to the dynamic page

In response to an _escaped_fragment_ URL, the origin server agrees to return to the crawler an HTML snapshot of the corresponding #! URL. The HTML snapshot must contain the same content as the dynamically created page.

HTML snapshots can be obtained in an offline process or dynamically in response to a crawler request. For a guide on producing an HTML snapshot, see the HTML snapshot section.
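As a rough illustration of the dynamic option, a request handler might branch on the presence of the _escaped_fragment_ parameter. This is only a sketch under assumptions: render_snapshot and serve_app_shell are hypothetical stand-ins for a site's real snapshot generator (for example, a headless-browser run) and its normal application page, and ugly_to_pretty is the mapping sketch from earlier.

from urllib.parse import urlparse, parse_qs

def render_snapshot(pretty_url):
    # Hypothetical stub: a real site would pre-render offline or run a
    # headless browser here to execute the page's JavaScript.
    return "<html><body>snapshot of " + pretty_url + "</body></html>"

def serve_app_shell(url):
    # Hypothetical stub: the normal AJAX application page.
    return "<html><body>application shell</body></html>"

def handle_request(url):
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    if "_escaped_fragment_" in query:
        # Crawler request: return the HTML snapshot of the corresponding
        # pretty URL (using the ugly_to_pretty sketch above).
        return render_snapshot(ugly_to_pretty(url))
    # Normal browser request: serve the AJAX application as usual.
    return serve_app_shell(url)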

It may be impossible or undesirable for some pages to have hash fragments in their URLs. For this reason, this scheme has a special provision for such pages: in order to indicate that a page without a hash fragment should be crawled again in _escaped_fragment_ form, it is possible to embed a special meta tag into the head of its HTML. The syntax for this meta tag is as follows:

<meta name="fragment" content="!">

The following important restrictions apply:

  1. The meta tag may only appear in pages without hash fragments.
  2. Only "!" may appear in the content field.
  3. The meta tag must appear in the head of the document.

The crawler treats this meta tag as follows: If the page www.example.com contains the meta tag in its head, the crawler will retrieve the URL www.example.com?_escaped_fragment_=. It will index the content of the page www.example.com and will display www.example.com in search results.

As noted above, the mapping from the _escaped_fragment_ to the #! syntax is slightly different in this case: to retrieve the original URL, the web server instead simply removes the tokens _escaped_fragment_= (note the =) from the URL. In other words, you want to end up with the URL www.example.com instead of www.example.com#!.

Warning: Should the content for www.example.com?_escaped_fragment_= return a 404 code, no content will be indexed for www.example.com! So, be careful if you add this meta tag to your page and make sure an HTML snapshot is returned.

In order to crawl your site's URLs, a crawler must be able to find them. Here are two common ways to accomplish this:

  1. Hyperlinks: An HTML page or an HTML snapshot can contain hyperlinks to pretty URLs, that is, URLs containing #! hash fragments. Note: The crawler will not follow links extracted from HTML that contain _escaped_fragment_.
  2. Sitemap: Pretty URLs may be listed in Sitemaps. For more information on Sitemaps, please see www.sitemaps.org.

Current practices will still be supported. Hijax remains a valid solution, as we describe here. Giving the crawler access to static content remains the main goal.

A few web pages already use exclamation marks as the first character in a hash fragment. Because hash fragments are not part of the URL that is sent to the server, such URLs have never been crawled. In other words, such URLs are not currently in the search index.

Under the new scheme, they can be crawled. In other words, a crawler will map each #! URL to its corresponding _escaped_fragment_ URL and request this URL from the web server. Because the site uses the pretty URL syntax (that is, #! hash fragments), the crawler will assume that the site has opted into the AJAX crawling scheme. This can cause problems, because the crawler will not get any meaningful content for these URLs if the web server does not return an HTML snapshot.

There are two options:

  1. The site adopts the AJAX crawling scheme and returns HTML snapshots.
  2. If this is not desired, it is possible to opt out of the scheme by adding a directive to the robots.txt file:

    Disallow: /*_escaped_fragment_

Mystery of the Email J Finally Solved

January 29, 2011


Mystery of the Email J Finally Solved :: Chris Jean

For years, I’ve received emails that have capital ‘J’s thrown in at the end of seemingly random sentences. It’s never been a big deal, but it has always baffled me.

For example, I just received an email that contained the following bit:

Ha, I didn’t mean to reply to everyone! I am glad I didn’t say anything bad. J Thank you for…

What in the world is that J doing there?

I’ve speculated all this time about what people could mean. At first, I thought it was short for “I’m joking,” but it would only sometimes make sense in that context. The other odd thing is that the J is always rendered in a different font than the rest of the text.

I would claim that this mystery has kept me up some nights, but that would be just a little bit more than a standard exaggeration.

Finally, today I stumbled upon an answer. Not only does it explain why the J would appear in contexts that would both be joking and not joking, but it also explains the odd font deal. If I had been observant or caring enough, I might have noticed another pattern, the odd appearance of J only happened when the sender sent the message from Outlook.

The simple answer to the crazy mystery of the ‘J’s is that Microsoft uses a Wingding to render a smiley in Outlook. The Wingdings happy face happens to sit at the same position as a J in the standard ASCII set. So, on all clients other than Outlook, it renders as an out-of-place-looking J.

Yay! Another example of Microsoft not following standards. Why use the near-universal understanding that ‘:’ followed by ‘)’ is a code for a happy face, and can be interpreted by software if desired, when you can just use one of your proprietary fonts instead?

Don’t get me wrong, I’m not an anti-Microsoft zealot; I just wish that software from Microsoft played nicer with the other children in the playground. Don’t even get me started on “smart” quotes and how many companies, not just Microsoft, have dropped the ball on that one, creating an amazing nightmare for humble coders like me who have to deal with their mess.

Facebook awarded over $360 million in damages against spammer

January 28, 2011


How to delete from Okasaki's red-black trees

Red-black trees

Red-black trees are self-balancing binary search trees in which every node has one of two colors: red or black.

Red-black trees obey two additional invariants:

  1. Any path from the root to a leaf has the same number of black nodes.
  2. All red nodes have two black children.

Leaf nodes, which do not carry values, are considered black for the purposes of both height and coloring.

Any tree that obeys these conditions ensures that the longest root-to-leaf path is no more than double the shortest root-to-leaf path. These constraints on path length guarantee fast, logarithmic reads, insertions and deletes.

Examples

The following is a valid red-black tree:

Both of the following are invalid red-black representations of the set {1,2,3}:

The following are valid representations of the set {1,2,3}:

Delete: A high-level summary

There are many easy cases in red-black deletion--cases where the change is local and doesn't require rebalancing or (much) recoloring.

The only hard case ends up being the removal of a black node with no children, since it alters the height of the tree.

Fortunately, it's easy to break apart this case into three phases, each of which is conceptually simple and straightforward to implement.

The trick is to add two temporary colors: double black and negative black.

The three phases are then removing, bubbling and balancing:

  1. By adding the color double-black, the hard case reduces to changing the target node into a double-black leaf. A double-black node counts twice for black height, which allows the black-height invariant to be preserved.
  2. Bubbling tries to eliminate the double black just created by a removal. Sometimes, it's possible to eliminate a double-black by recoloring its parent and its sibling. If that's not possible, then the double-black gets "bubbled up" to its parent. To do so, it might be necessary to recolor the double black's (red) sibling to negative black.
  3. Balancing eliminates double blacks and negative blacks at the same time. Okasaki's red-black algorithms use a rebalancing procedure. It's possible to generalize this rebalancing procedure with two new cases so that it can reliably eliminate double blacks and negative blacks.

Red-black trees in Racket

My implementation of red-black trees is actually an implementation of red-black maps:

; Struct definition for sorted-map:
(define-struct sorted-map (compare))

;  Internal nodes:
(define-struct (T sorted-map)
  (color left key value right))

;  Leaf nodes:
(define-struct (L sorted-map) ())

;  Double-black leaf nodes:
(define-struct (BBL sorted-map) ())

Every sorted-map has a comparison function on keys. Each internal node (T) has a color, a left sub-tree, a key, a value and a right sub-tree. There are also black leaf nodes (L) and double-black leaf nodes (BBL).

The implementation contains four colors total--double black ('BB), black ('B), red ('R) and negative black ('-B).

To make the expression of routines and sub-routines compact and readable, I used Racket's fully extensible pattern-matching system:

; Matches internal nodes:
(define-match-expander T!
  (syntax-rules ()
    [(_)            (T _ _ _ _ _ _)]
    [(_ l r)        (T _ _ l _ _ r)]
    [(_ c l r)      (T _ c l _ _ r)]
    [(_ l k v r)    (T _ _ l k v r)]
    [(_ c l k v r)  (T _ c l k v r)]))

; Matches leaf nodes: 
(define-match-expander L!
  (syntax-rules ()
    [(_)     (L _)]))

; Matches black nodes (leaf or internal):
(define-match-expander B
  (syntax-rules ()
    [(_)              (or (T _ 'B _ _ _ _)
                          (L _))]
    [(_ cmp)          (or (T cmp 'B _ _ _ _)
                          (L cmp))]
    [(_ l r)          (T _ 'B l _ _ r)]
    [(_ l k v r)      (T _ 'B l k v r)]
    [(_ cmp l k v r)  (T cmp 'B l k v r)]))

; Matches red nodes:
(define-match-expander R
  (syntax-rules ()
    [(_)              (T _ 'R _ _ _ _)]
    [(_ cmp)          (T cmp 'R _ _ _ _)]
    [(_ l r)          (T _ 'R l _ _ r)]
    [(_ l k v r)      (T _ 'R l k v r)]
    [(_ cmp l k v r)  (T cmp 'R l k v r)]))

; Matches negative black nodes:
(define-match-expander -B
  (syntax-rules ()
    [(_)                (T _ '-B _ _ _ _)]
    [(_ cmp)            (T cmp '-B _ _ _ _)]
    [(_ l k v r)        (T _ '-B l k v r)]
    [(_ cmp l k v r)    (T cmp '-B l k v r)]))

; Matches double-black nodes (leaf or internal):
(define-match-expander BB
  (syntax-rules ()
    [(_)              (or (T _ 'BB _ _ _ _)
                          (BBL _))]
    [(_ cmp)          (or (T cmp 'BB _ _ _ _)
                          (BBL _))]
    [(_ l k v r)      (T _ 'BB l k v r)]
    [(_ cmp l k v r)  (T cmp 'BB l k v r)]))

To further condense cases, the implementation also uses color arithmetic. For instance, adding a black to a black yields a double-black. Subtracting a black from a black yields a red. Subtracting a black from a red yields a negative black. In Racket:

(define/match (black+1 color-or-node)
  [(T cmp c l k v r)  (T cmp (black+1 c) l k v r)]
  [(L cmp)            (BBL cmp)]
  ['-B 'R]
  ['R  'B]
  ['B  'BB])

(define/match (black-1 color-or-node)
  [(T cmp c l k v r)  (T cmp (black-1 c) l k v r)]
  [(BBL cmp)          (L cmp)]
  ['R   '-B]
  ['B    'R]
  ['BB   'B])

Diagrammatically:

Red-black deletion in detail

In Racket, the skeleton for red-black deletion is:

(define (sorted-map-delete node key)
  
  ; The comparison function on keys:
  (define cmp (sorted-map-compare node))
  
  ; Finds and deletes the node with the right key:
  (define (del node) ...)

  ; Removes this node; it might
  ; leave behind a double-black node:
  (define (remove node) ...)
 
  ; Kills a double-black, or moves it upward;
  ; it might leave behind a negative black:
  (define (bubble c l k v r) ...)
  
  ; Removes the max (rightmost) node in a tree;
  ; may leave behind a double-black at the root:
  (define (remove-max node) ...)
   
  ; Delete the key, and color the new root black:
  (blacken (del node)))

Finding the target key

The procedure del searches through the tree until it finds the node to delete, and then it calls remove:

  (define/match (del node)
    [(T! c l k v r)
     ; =>
     (switch-compare (cmp key k)
       [<   (bubble c (del l) k v r)]
       [=   (remove node)]
       [>   (bubble c l k v (del r))])]
    
    [else     node])

(define/match and switch-compare are macros to make the code more compact and readable.)

Because deletion could produce a double-black node, the procedure bubble gets invoked to move it upward.

Removal

The remove procedure breaks removal into several cases:

The cases group according to how many children the target node has. If the target node has two sub-trees, remove reduces it to the case where there is at most one sub-tree.

It's easy to turn removal of a node with two children into removal of a node with at most one child: find the maximum (rightmost) element in its left (less-than) sub-tree; remove that node instead, and place its value into the node to be removed.

For example, removing the blue node (with two children) reduces to removing the green node (with one) and then overwriting the blue with the green:

If the target node has leaves for children, removal is straightforward:

A red node becomes a leaf node; a black node becomes a double-black leaf.

If the target node has one child, there is only one possible case. (I originally thought there were three, but Wei Hu pointed out that the other two violate red-black constraints, and cannot happen.)

That single case is where the target node is black and its child is red.

The child becomes the parent, and it is made black:

The corresponding Racket code for these cases is:

  (define/match (remove node)
    ; Leaves are easy to kill:
    [(R (L!) (L!))     (L cmp)]
    [(B (L!) (L!))     (BBL cmp)]
    
    ; Killing a node with one child:
    [(or (B (R l k v r) (L!))
         (B (L!) (R l k v r)))
     ; =>
     (T cmp 'B l k v r)]
    
    ; Killing a node with two sub-trees:
    [(T! c (and l (T!)) (and r (T!)))
     ; =>
     (match-let (((cons k v) (sorted-map-max l))
                 (l*         (remove-max l)))
       (bubble c l* k v r))])

For the record, these were the two cases that got eliminated:

Bubbling

The bubble procedure moves double-blacks from children to parents, or eliminates them entirely if possible.

There are six possible cases in which a double-black child appears:

In every case, the action necessary to move the double black upward is the same: a black is subtracted from the children and added to the parent:

This operation leads to the corresponding trees:

A dotted line indicates the need for a rebalancing operation, because of the possible introduction of a red/red or negative black/red parent/child relationship.

Because the action is the same in every case, the code for bubble is short:

  (define (bubble c l k v r)
    (cond
      [(or (double-black? l) (double-black? r))
       ; =>
       (balance cmp (black+1 c) (black-1 l) k v (black-1 r))]
      
      [else (T cmp c l k v r)]))

Generalizing rebalancing

Okasaki's balancing operation takes a tree with balanced black-height but improper coloring and performs a tree rotation and a recoloring.

The original procedure focused on fixing red/red violations. The new procedure has to fix negative-black/red violations, and it also has to opportunistically eliminate double-blacks.

The original procedure eliminated all of the red/red violations in these trees:

by turning them into this tree:

The extended procedure can handle a root that is double-black:

by turning them all into this tree:

If a negative black appears as the result of a bubbling, as in:

then a slightly deeper transformation is necessary:

Once again, the dotted lines indicate the possible introduction of a red/red violation that could need rebalancing.

So, the balance procedure is recursive, but it won't call itself more than once.

There is also the symmetric case for this last operation, and these two new cases take care of all possible negative blacks.

In Racket, only two new cases are added to the balancing procedure:

; Turns a black-balanced tree with invalid colors
; into a black-balanced tree with valid colors:
(define (balance-node node)
  (define cmp (sorted-map-compare node))
  (match node

    ; Classic balance, but also catches double blacks:
    [(or (T! (or 'B 'BB) (R (R a xk xv b) yk yv c) zk zv d)
         (T! (or 'B 'BB) (R a xk xv (R b yk yv c)) zk zv d)
         (T! (or 'B 'BB) a xk xv (R (R b yk yv c) zk zv d))
         (T! (or 'B 'BB) a xk xv (R b yk yv (R c zk zv d))))
     ; =>
     (T cmp (black-1 (T-color node)) 
            (T cmp 'B a xk xv b)
            yk yv 
            (T cmp 'B c zk zv d))]

    ; Two new cases to eliminate negative blacks:
    [(BB a xk xv (-B (B b yk yv c) zk zv (and d (B))))
     ; =>
     (T cmp 'B (T cmp 'B a xk xv b)
               yk yv
               (balance cmp 'B c zk zv (redden d)))]
    
    [(BB (-B (and a (B)) xk xv (B b yk yv c)) zk zv d)
     ; =>
     (T cmp 'B (balance cmp 'B (redden a) xk xv b)
               yk yv
               (T cmp 'B c zk zv d))]
    
    [else     node]))
  
(define (balance cmp c l k v r)
  (balance-node (T cmp c l k v r)))

And, that's it.

Code

The code is available as a Racket module. My testing script is also available.

The testing system uses a mixture of exhaustive testing (on all trees with up to eight elements) and randomized testing (on much larger trees).

I'm confident it flushed the bugs out of my implementation. Please let me know if you find a test case that breaks it.


How GitHub deals with DMCA Takedown

January 28, 2011


Help.GitHub - DMCA Takedown

GitHub, Inc. (“GitHub”) supports the protection of intellectual property and asks the users of the website GitHub.com to do the same. It is the policy of GitHub to respond to all notices of alleged copyright infringement.

Notice is specifically given that GitHub is not responsible for the content on other websites that any user may find or access when using GitHub.com. This notice describes the information that should be provided in notices alleging copyright infringement found specifically on GitHub.com, and this notice is designed to make alleged infringement notices to GitHub as straightforward as possible and, at the same time, minimize the number of notices that GitHub receives that are spurious or difficult to verify. The form of notice set forth below is consistent with the form suggested by the United States Digital Millennium Copyright Act (“DMCA”) which may be found at the U.S. Copyright official website: http://www.copyright.gov.

It is the policy of GitHub, in appropriate circumstances and in its sole discretion, to disable and/or terminate the accounts of users of GitHub.com who may infringe upon the copyrights or other intellectual property rights of GitHub and/or others.

Our response to a notice of alleged copyright infringement may result in removing or disabling access to material claimed to be a copyright infringement and/or termination of the subscriber. If GitHub removes or disables access in response to such a notice, we will make a reasonable effort to contact the responsible party of our decision so that they may make an appropriate response.

To file a notice of an alleged copyright infringement with us, you are required to provide a written communication only by email or postal mail. Notice is also given that you may be liable for damages (including costs and attorney fees) if you materially misrepresent that a product or activity is infringing upon your copyright.

To expedite our handling of your notice, please use the following format or refer to Section 512(c)(3) of the Copyright Act.

  1. Identify in sufficient detail the copyrighted work you believe has been infringed upon. This includes identification of the web page or specific posts, as opposed to entire sites. Posts must be referenced by either the dates in which they appear or by the permalink of the post. Include the URL to the concerned material infringing your copyright (URL of a website or URL to a post, with title, date, name of the emitter), or link to initial post with sufficient data to find it.

  2. Identify the material that you allege is infringing upon the copyrighted work listed in Item #1 above. Include the name of the concerned litigious material (all images or posts if relevant) with its complete reference.

  3. Provide information on which GitHub may contact you, including your email address, name, telephone number and physical address.

  4. Provide the address, if available, to allow GitHub to notify the owner/administrator of the allegedly infringing webpage or other content, including email address.

  5. Also include a statement of the following: “I have a good faith belief that use of the copyrighted materials described above on the infringing web pages is not authorized by the copyright owner, or its agent, or the law.”

  6. Also include the following statement: “I swear, under penalty of perjury, that the information in this notification is accurate and that I am the copyright owner, or am authorized to act on behalf of the owner, of an exclusive right that is allegedly infringed.”

  7. Your physical or electronic signature

Send the written notification via regular postal mail to the following:

GitHub Inc
Attn: DMCA takedown
589 Howard St., 4th Floor
San Francisco, CA. 94105

or email notification to copyright@github.com

To be effective, a Counter-Notification must be a written communication by the alleged infringer provided to GitHub’s Designated Agent (as set forth above) that includes substantially the following:

  1. A physical or electronic signature of the Subscriber;

  2. Identification of the material that has been removed or to which access has been disabled and the location at which the material appeared before it was removed or access to it was disabled;

  3. A statement under penalty of perjury that the Subscriber has a good faith belief that the material was removed or disabled as a result of a mistake or misidentification of the material to be removed or disabled;

  4. The Subscriber’s name, address, and telephone number, and a statement that the Subscriber consents to the jurisdiction of Federal District Court for the judicial district of California, or if the Subscriber’s address is outside of the United States, for any judicial district in which GitHub may be found, and that the Subscriber will accept service of process from the person who provided notification or an agent of such person.

Upon receipt of a Counter Notification containing the information as outlined in 1 through 4 above:

  • GitHub shall promptly provide the Complaining Party with a copy of the Counter Notification;

  • GitHub shall inform the Complaining Party that it will replace the removed material or cease disabling access to it within ten (10) business days;

  • GitHub shall replace the removed material or cease disabling access to the material within ten (10) to fourteen (14) business days following receipt of the Counter Notification, provided GitHub’s Designated Agent has not received notice from the Complaining Party that an action has been filed seeking a court order to restrain Subscriber from engaging in infringing activity relating to the material on GitHub’s system.

Finally, Notices and Counter-Notices with respect to this website must meet the then-current statutory requirements imposed by the DMCA; see http://www.copyright.gov for details.

A News Feed for your site's Google Analytics

January 28, 2011


Eric Kerr | A News Feed for your site’s Google Analytics

01.28.11

Brain Dump. Google Analytics is a remarkable product that provides a ton of information. My hunch is that most users barely scratch the surface on the full potential of all its features. You can slice and dice data up in a number of ways to get breakdowns on whatever you’re interested in.

There are so many different ways to break the analytics down that it’s overwhelming to find interesting data. Google has two features that attempt to solve this problem: Create Advanced Segments and Intelligence. Intelligence is active in that it requires you to manually set up alerts on the account you’re interested in. Advanced Segments is also great but requires you to actively set up the data filters that you’re interested in. The problem is that I’m lazy, and you are too.

What I want is a service that sits on top of Google Analytics and tells me statistically interesting things about my site without me having to actively go through all the breakdowns, fiddle with the filters, and make educated guesses on the relative stats of what I’m looking at. It would be able to tell you things like:
- During the last week, visits from Colorado have increased 30% more than all other states.
- Today, your site received 2,500 pageviews from this url on Reddit.
- During the last month, your keyword traffic for “Cheap Vinyl Record Player” has increased 450%.
- Mondays are usually your highest traffic day of the week, but this Monday was 15% higher than what was expected.
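
As a toy illustration of the last example above, such a check could be as simple as comparing today’s count against the average for the same weekday in recent weeks. The numbers below are made up; a real version would pull them from the Analytics Export API.

def weekday_change(history, today):
    # history: pageview counts for the same weekday in previous weeks.
    baseline = sum(history) / len(history)
    return (today - baseline) / baseline * 100

previous_mondays = [4100, 3950, 4230, 4050]   # hypothetical pageview counts
this_monday = 4830
change = weekday_change(previous_mondays, this_monday)
if change >= 10:
    print("Monday traffic was %.0f%% higher than expected." % change)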

Facebook created the News Feed to solve an almost identical problem: all of the information a user cares about (their friends) is scattered across hundreds of pages, so let’s aggregate it all into one stream, highlight what’s likely to be most relevant, and let it be a starting point to dive deeper into the site.

The service would utilize the Google Analytics Export API, which I believe provides enough data to make this work. If the API doesn’t, then the service would need to collect your Google Account information so it could log into your account for you and periodically scrape the data from your reports. The latter is obviously not ideal and is probably against their TOS, but similar services exist for iPhone developers that sit on top of iTunes Connect (doesn’t make it ok, but humor me).

The purpose of this separate News Feed product is to allow you to dive into interesting data in your Google Analytics account. Certainly the most commonly utilized metric is the total number of pageviews, as it is prominently displayed on the Dashboard, but this tool would help you dive into the lesser-visited sections of your account. The interface would be something of an amalgamation of the current Google Analytics Dashboard and a traditional Facebook-like News Feed. I’ve been thinking about this for a little bit now, among a few other things, so I wanted to see what others thought. I’d greatly appreciate any feedback posted to Hacker News.

Indian Government Restricts PayPal

January 28, 2011


PayPal India Problems Continue

The Reserve Bank of India has been giving PayPal and its users in India a hard time. RBI has previously blocked PayPal transactions in India a few times, and it has made it difficult to withdraw payments by enforcing export- and forex-related compliance. Here is yet more bad news for Indian PayPal users.

With effect from March 1st, Indian users cannot receive payments of more than $500 per transaction in their PayPal accounts. Moreover, you cannot keep or use any funds in your PayPal account: you cannot use your PayPal balance to send money or pay for any goods or services, and you must withdraw it to your bank account within 7 days of receipt.

These changes have rendered PayPal almost useless for small businesses, webmasters and publishers. Most webmasters and publishers rely on PayPal to receive payments from advertisers and clients. The changes have also made it impossible to buy anything online with PayPal. Sending payments abroad via other channels is already a pain: a bank wire requires too many formalities, too much documentation and too much time. Moreover, you are even required to deduct TDS on payments you make for any products or services.

The restrictions will take effect on March 1st, so you have 30 days to complete any pending transactions you may have.

This step by RBI is yet another gimmick by the corrupt Indian Government to make life difficult for entrepreneurs, kill innovation, slap on more taxes and create more channels for taking bribes.

Following is the notification from PayPal about this issue:
As part of our commitment to provide a high level of customer service, we would like to give you a 30-day advance notice on changes to our user agreement for India.

With effect from 1 March 2011, you are required to comply with the requirements set out in the notification of the Reserve Bank of India governing the processing and settlement of export-related receipts facilitated by online payment gateways (“RBI Guidelines”).

In order to comply with the RBI Guidelines, our user agreement in India will be amended for the following services as follows:

  1. Any balance in and all future payments into your PayPal account may not be used to buy goods or services and must be transferred to your bank account in India within 7 days from the receipt of confirmation from the buyer in respect of the goods or services; and
  2. Export-related payments for goods and services into your PayPal account may not exceed US$500 per transaction.

We seek your understanding as we continue to employ our best efforts to comply with the RBI Guidelines in a timely manner.

The Rise And Fall of Languages in 2010

January 28, 2011


Dr Dobbs - The Rise And Fall of Languages in 2010

The Tiobe Programming Community Index is a gauge that has long tracked the rise and fall of programming languages. First developed during the mid-1990s, the index relies on various data sources to gauge the popularity of languages; that is, the extent to which they're used (but not how much they're liked). The primary sources of information are the top six search engines. The process of how the raw data is converted into rankings is explained in considerable detail on the index's definition page. Suffice it to say that the index is carefully crafted to track the emergence, adoption, and eventual decline of every major and most minor languages (from ABAP to Z shell). Languages have to be Turing-complete, so interlopers that appear in surveys by non-programmers — XML, HTML, and the like — don't pollute Tiobe's findings.

Over the years, the index has been both praised and vilified, the latter by language adherents who are unhappy over the decline of their favorite idiom. Like many writers who've referred to the index, I've been castigated in the past for using it to support the fact that, let's say, Perl is in an inexorable descent (so is Java, but let me not get ahead of myself). Whether you like the results or not, you can't help but be impressed by the care with which the index is compiled. Moreover, having graphs showing 10 years of language adoption is a tremendous research resource.

In January of each year, the Tiobe site provides a recap of the previous year and compares languages using five-year mileposts. All of the data points reveal interesting, if sometimes inexplicable, results.

Looking over the ten-year chart, several patterns emerge immediately. The steady decline of Java is real. Ten years ago, it made up nearly 27% of mentions; since then, it's dropped to 18%. What is less clear is the state of JVM languages as a whole. But we can make a good guess that even if the main JVM languages (Groovy, Scala, and Clojure — JRuby and Jython are broken out separately from their parent languages, alas) were added back in to Java's numbers, the total would still see a decline. Like most readers, I would expect this to be the case, and I expect most of the emigrants migrated to Ruby and Python. We'll see in a moment if this theory is supported.

Java's decline, however, has not knocked it from the top position. It now enjoys a thin lead over C, followed (after a substantial gap) by C++ and PHP. These last two languages have been exchanging positions for a long time. While they've both declined somewhat during the last year, it's too early to tell whether or not that's a trend.

Next is Python, which I'll discuss in a moment. Then comes C#, which has been gaining steadily and finally passed Visual Basic as the .NET language of choice in 2010. Visual Basic has seen perhaps the most precipitous decline of any of the top ten languages — its popularity has dropped by half over the last three years. Most developers in the Microsoft universe would probably have intuited this change, but now there are unmistakably clear numbers to support this. Visual Basic's drop in favor of C#, I believe, is a function of VB being asked to do more than grind out CRUD apps, which were its bread and butter. The more complex the app, the more C# is the .NET language of choice. (Not to be overlooked in this regard is Microsoft's excellent stewardship of the language, which makes this shift possible.) I expect this trend will continue and Visual Basic will move increasingly to the end of the .NET stable where the old plow horses are quartered.

Now comes the interesting part. Python surged hugely this last year — more than any other language. It just beat out Objective-C for greatest increase in adoption in 2010. (Objective-C's jump is likely attributable to the popularity of the iPad, which came out last year.) It's hard to find any specific reason for Python's surge, but I suspect it's the result of continued broad adoption and Google's enthusiastic backing. What is surprising is that several factors might have argued against a break-out year for Python: Two incompatible versions of the language and the uncertain fate of the Unladen Swallow project. The folks at Tiobe speculate that one cause of its newfound popularity might be that it is becoming the teaching language of choice in college courses.

I'm not convinced. I suspect part of the new adoption comes from Perl programmers who are throwing in the towel. Perl fell again last year and the slope of its collapse is holding steady. In mid-2005, the language received 10.5% of mentions; by end of 2010, this number had dropped to 2% – 3%. I expect this trend will continue, as I strongly doubt that many new green field projects would today choose Perl as the principal development language. The causes of Perl's descent into what will soon be meaninglessness are many, but I believe Python is the principal reason. Python is better at doing what Perl does, and it's an easier language to learn.

The final entrant in the top 10 is Ruby, which — surprise, surprise — declined year over year. I don't have any insight into this save to wonder whether the Ruby on Rails (RoR) jubilation has passed into a period of more sober assessment of technology. While RoR is easy to use for erecting new sites quickly, if you need to move out of the basic RoR model, things become more difficult. Moreover, RoR has not broken out of its principal playing field of SMBs, which is what it probably needed to do to continue putting up big growth numbers. This may well change with the increased traction of JRuby, which I believe is how Ruby will end up getting into many data centers where the JVM is already established. If Ruby gains ground this year, it will almost certainly be due to JRuby.

Possibly the biggest surprise in the index was that JavaScript fell out of the top 10. I confess I am baffled by this. JavaScript is the defining language of Web apps. Moreover, I keep seeing perverse instances of Atwood's law (named after über-blogger Jeff Atwood): "Any application that can be written in JavaScript, will eventually be written in JavaScript." Moreover, JavaScript, like Python, has strong corporate backers who've worked hard to optimize execution engines and distribute the language widely (for example, it's now the default language of the Java JSR-223 scripting engine).

The real benefit of the Tiobe index comes from looking at trends rather than fluctuations, and making decisions accordingly when starting new projects. Java, C, C++, PHP, C#, and Python are languages that are widely used and will have plenty of support in the years to come. For all other languages, things are less clear.

Tunisia, Egypt, Miami: The Importance of Internet Choke Points

January 28, 2011


Tunisia, Egypt, Miami: The Importance of Internet Choke Points - Andrew Blum - Technology


The news yesterday evening that Egypt had severed itself from the global Internet came at the same time as an ostensibly far less inflammatory announcement closer to home. Verizon, the telecom giant, would acquire "cloud computing company" Terremark for $1.4 billion. The purchase would "accelerate Verizon's 'everything-as-a-service' cloud strategy," the press release said.

The trouble is that Terremark isn't merely a cloud computing company. Or, more to the point, the cloud isn't really a cloud.

Among its portfolio of data centers in the US, Europe and Latin America, Terremark owns one of the single most important buildings on the global Internet, a giant fortress on the edge of Miami's downtown known as the NAP of the Americas.

The Internet is a network of networks. But what's often forgotten is that those networks actually have to physically connect -- one router to another -- often through something as simple and tangible as a yellow-jacketed fiber-optic cable. It's safe to suspect a network engineer in Egypt had a few of them dangling in his hands last night.

Terremark's building in Miami is the physical meeting point for more than 160 networks from around the world. They meet there because of the building's excellent security, its redundant power systems, and its thick concrete walls, designed to survive a category 5 hurricane. But above all, they meet there because the building is "carrier-neutral." It's a Switzerland of the Internet, an unallied territory where competing networks can connect to each other. Terremark doesn't have a dog in the fight. Or at least it didn't.

Verizon insists there's nothing to worry about. Terremark will be set up as a wholly owned subsidiary. Its carrier-neutral status will remain. "We're not going to try to cramp their style at all," said Lowell McAdam, President and COO of Verizon. "There will be no moves to take certain customers out of play."

I can't help but think of it in the context of another recent purchase. Earlier this month, Google bought its New York office building, 111 8th Avenue, for a reported $1.9 billion. As the Wall Street Journal described, "about one third of the space is occupied by telecommunications companies." But that's severely understating the situation: 111 8th is another of the most important buildings on the Internet, on a short list of fewer than a dozen worldwide. Like the NAP of the Americas, it houses hundreds of independent networks, scattered across the office spaces of multiple independently owned sub-landlords. And now Google owns the whole thing. One assumes that they're not going to cramp their style either.

"It's not about the 'carrier hotel' space," said Google Senior Vice President Jonathan Rosenberg. "We have 2,000 employees on site. It's a big sales center, but also a big engineering center. With the pace at which we're growing, it's very difficult to find space in New York. There are very few buildings in New York that can accommodate our needs. This gives us a lot of control over growing into the space."

But on a day when the government of 80 million people managed to throw the Internet's "kill switch," it's worth remembering that the Internet is a physical network. It matters who controls the nodes. With these two deals, Google and Verizon may have chipped away at the foundation walls of an open, competitive -- and therefore free -- Internet.


Images: 1. Alex Hoyt. 2. Terremark.


Makin' It Rain - How Raindrop Effects Work In 2D Games

January 28, 2011


Game1, Day 19 – Makin’ it rain, unfortunately not with dollar bills!

Quickdraw

I’m doing up the rain effect today. I’m hoping to get this game to run smoothly on older devices, so I wanted to make the rain easily modifiable. I’m not sure how many sprites Cocos2d can push at a time on, say, a 2nd gen iPod Touch… I’m sure I don’t have anything to worry about; I’m not making Smash TV here or anything, there’s basically a ninja, some stuff being thrown at him, and a couple of background images. But I always like to stay on the safe side, because it’s way nicer to find out you have excess memory at the end of the project than to find out you don’t have enough, haha.

My early days in the game industry were spent working on cell phone games where we’d make an S60 version, which would be something like 260×320 and let us do badass full-screen art and everything. But we’d then have to make an S40 version, which was like 120×208 and way less powerful, so I’d have to replace artsy backgrounds with color bar gradients and stuff. We also had to do a third version… S30 maybe? Can’t quite remember. But it was a piece of crap. Like a 128×128 screen and super weak power, where we could do the absolute most minimal stuff. So I got in the habit of thinking in terms of “how can I strip this down if I have to?” I found it was easier to just work efficiently from the start.

So this is how I usually do rain…I stole this method from some RPG on some console back in the day. I have a raindrop graphic and a water splat animation:

The actual drops in-game are super long, but having a giant graphic like that is a waste so I just make a shorter version and stretch it vertically in-game…the above is for the 1024×768 iPad. I actually had a much shorter drop but I would have had to scale it like 2000% and I get nervous about how the game’s going to handle a bunch of 2000% scaled images, so I like to size it so that when I stretch it out I only have to stretch it between 200% – 500%. The rain splat is just a quick doodle of a drop blipping into some ripples.

Now I combine these by showing the raindrops randomly all over the screen, and I also show the rain splats randomly around the screen. So the drops aren’t ACTUALLY hitting the spots where the rain splats appear, but when you look at them together the overall effect is that rain is pouring. If I had a splat where each drop lands, I’d have to figure out the depth/height of objects, which surfaces can get a drop, and if there were 50 raindrops I’d need 50 splats as well, etc. With this method I can have 50 raindrops but only 10 splats and get the same overall effect. In the end I have about 11 drops shown at a time and maybe 5 splats shown at a time. The layers look like so:

I figure if I run into slow-down on older devices I can just remove a few raindrops or splats until it runs smoothly.
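
To make the idea concrete, here’s a small framework-agnostic sketch of the technique in Python: drops and splats are spawned at independent random positions, so the two counts can be tuned separately for slower devices. The names and numbers are illustrative, not taken from the game’s actual code.

import random

SCREEN_W, SCREEN_H = 1024, 768   # iPad-sized playfield
NUM_DROPS, NUM_SPLATS = 11, 5    # dial these down if an older device struggles

def spawn_rain():
    drops = [{"x": random.uniform(0, SCREEN_W),
              "y": random.uniform(0, SCREEN_H),
              "scale_y": random.uniform(2.0, 5.0)}   # stretch the short drop 200%-500%
             for _ in range(NUM_DROPS)]
    splats = [{"x": random.uniform(0, SCREEN_W),
               "y": random.uniform(0, SCREEN_H)}     # splats never need to line up with drops
              for _ in range(NUM_SPLATS)]
    return drops, splats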

I like this method for making rain in an RPG because you can specify in your tiles which ones should get rainsplats and which ones shouldn’t (so you’d have the roof tiles on a house splattable but the wall tiles wouldn’t be).

- Quickdraw


Buddycloud's node.js server just released

January 28, 2011


How Organized Spam is Taking Control of Google's Search Results

In the past few months, I have been watching a very unsettling trend unfold in very competitive ecommerce search results on Google. It appears that huge amounts of money are put into place to systematically and successfully manipulate highly competitive search terms in order to sell fake merchandise of almost every bigger brand out there. Some of these sites even solely exist to steal people's money and don't ship anything at all.

Rest assured that I am not talking about some people doing good linkbuilding or about people buying a lot of links. These operations I talk about are much, much bigger and in all cases almost certainly run by criminal organizations of some sort. They not only greatly affect US search results but are also very present in at least UK, France and Germany.

In this article, I will show you several examples of where Google’s search is absolutely broken (and by broken, I mean that 10 out of 10 page-one search results are entirely fraudulent). I will also show you exactly how these rankings are achieved and take a look at what the impact on consumers may very well be. Last but not least, I’ll try to help you recognize these kinds of websites so you can avoid them, as they are becoming increasingly difficult to identify.

Exhibit A (“nfl jerseys”)

Let’s get started with [nfl jerseys] as our first keyword to be examined. If you take a look at the US search results on google.com with personal search disabled (add &pws=0 to any search URL), you will get a list of websites which claim to sell said wear and merchandise at a significantly discounted price. Such huge discounts can be found on pretty much any of these fake shops, many ranging up to 75% in “savings”.

Here’s the search engine results page as of 01/04/2011:

NFL Jerseys SERP

(Note: I’m not trying to “out” any particular site, so I removed any domain names in question from the screenshots)

As you can easily see, all of these sites feature ridiculous keyword stuffing in their root page titles as well as the term “jersey” within their domain names. Both are very common among them. Result #5 even contains Chinese characters.

As of this writing, the entire first results page is composed of fraudulent websites. In other words, Google’s organic results have become entirely useless for this search phrase.

Exhibit B (“pandora jewelry”)

Next, [pandora jewelry], also a very popular and well-respected brand. Positions 3, 5, 7, 8, 9 and 10 are fraud, which results in a 60% share of useless results.

Pandora SERP

Exhibit C (“thomas sabo”)

The [thomas sabo] SERPs feel like déjà vu. Another jewelry brand, another wave of artificially boosted shops shipping either replica ware or just nothing at all: results 3, 4, 5, 6, 8, 9 and 10 should not be listed there in the first place (70%).

Thomas Sabo SERP

All of these sites try to appear as legit and official as possible.

See for yourself

Before diving into the details, I urge you to take a look at Google’s results for these queries yourself. Try searching for other brands, too. Almost every popular brand is affected by this growing issue.

How they do it

Now that I’ve shown you how seriously broken Google is, let’s take a look at why Google is ranking these sites so well. Since these are not legit shops, it’s obvious that there is no kind of “branding bonus” or boost from actual social media activity at hand. There’s only one thing that leads to these rankings. You’ve guessed it: keyword-rich, anchor-text-heavy links.

The interesting question is though: where do these sites get their (anchor-text rich) links?

I have taken a look at many of these sites and found that their link profiles basically consist of two kinds of links: automated forum and blog spam, along with some hacked websites.

Let’s take a look at the anchor text variation of result #1 for “nfl jerseys”:

Anchor Text Variation "nfl jerseys"

This site also has a page authority of 64 and a domain authority of 57, according to Open Site Explorer.

Google's best guess for "pandora jewelry" looks similar:

Anchor Text Variation "pandora jewelry"

And the #1 "thomas sabo" result:

Anchor Text Variation "thomas sabo"

You might be surprised to see plain-old forum spam work this well, but let me get one thing straight: it’s not like Google is not penalizing or de-indexing any of these sites. I see them come and go on a daily basis (although some actually seem to stick for weeks or even months).

However, these people (or rather organizations) push such huge amounts of these sites into the web that Google - obviously - is having quite a hard time catching up.

In some way, and this is my personal opinion, this might be related to the Caffeine update - Google is now crawling and ranking sites a lot faster than ever before, but it appears overall search quality has suffered dramatically in the past 6 months or so.

Furthermore, link placements on hacked websites are very difficult to spot algorithmically. Granted, many of those links are not visible to the human eye and that should raise some flags since Google is capable of rendering any page, but overall it’s not comparable to catching automated posts on tens of thousands of web forums.

What really should have set Google's alarm off, though, are the link growth patterns. Let's take a look at the "nfl jerseys" top 3:

"nfl jerseys" link growth

Two of these sites started spamming back in April 2010 and are still ranking in January 2011. Go figure.

Same goes for "pandora jewelry" and "thomas sabo":

"pandora jewelry" link growth

"thomas sabo" link growth

You get the picture.

What Google needs to do about it

Rand talked about it already, and his advice is instantly applicable to this issue: Google needs to greatly lower the value of keyword-rich anchor texts.

Think about it: if Google had not taken anchor text into account for these sites at all, none of them would likely rank anywhere near the top 10 results. Their links come from very different sources, and almost none of those sources is even remotely related to what they're pretending to sell.

As long as anchor text links outrank links from actually related websites, this is not going away anytime soon. Same goes for exact match keyword domains, by the way.

I do realize that anchor text is very important, but its abuse has reached a point where it’s no longer a ranking signal to be trusted as much as it currently is. Heck, I've actually seen websites rank #3 for these terms with one single sentence on the page: "seized by Department of Homeland Security".

What it means for SEO

Google has a serious problem, and I’m sure that they have been working on it relentlessly for quite some time now.

What it means for SEO is that whatever is working for the sites mentioned in this article - it will probably stop working soon. I would not be surprised to see Google shift even more ranking signal power from anchor-text heavy links to relevant social media “chatter”. I have a feeling that it’s gaining more traction as we speak.

Of course, tweets and status updates can be spammed, bought and faked, too. But at least it will buy Google some time.

This fight is never over nor ever "won" by anyone. Ever.

How to identify these sites as a consumer

Since I don’t want any of you to order from these guys and receive either fake goods or nothing at all, here’s some advice to identify them:

Most of these sites:

  • offer unrealistic discounts (50% or more is pretty much the norm)
  • have no actual postal address
  • supply only a contact form or
  • supply only a GMail/Hotmail email address to contact them
  • feature way too many “trusted logos” in their footer
  • are written in poor English

Considering that most of the sites I talked about earlier already ranked well while all the holiday shopping took place, I can only imagine the damage done to thousands of families and individuals.

Please be cautious and remember that if a deal sounds too good to be true, it very probably is.

Since this is my first article for SEOmoz, please let me know in the comments if you liked it and give me a “thumbs up” if you did. In case you've ever been affected by this kind of fraud personally, I'd love to hear from you, too.

- Rouven Balci, SEO at Toms Gutscheine

Is Quora Ready for Pop Music?

January 28, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr

slash blog - AP's thoughts and musings » Blog Archive » Is Quora Ready for Pop Music?

Posted: January 28th, 2011

I am not a word, I am not a line
I am not a girl that can ever be defined
-Nicki Minaj

I drank the Quora kool-aid early, and now that I've been on the site for about a year I have decided to test its limits.

If the site is really meant to be a repository for that which is not Googleable, it will need to satisfy broader intellectual curiosity beyond just the down-to-the-last-detail history of the Facebook Like Button.

So I decided to do some tests with the ultimate in inexplicable phenomena: pop music. I was the most loyal of loyal z100 listeners as a kid, but have since lost interest in the minutiae of Top 40 hits. That said, the rise to hyper-popularity of certain pop stars in recent years, especially given the complete and utter fragmentation of distribution channels, is fascinating.

One such star is Nicki Minaj, the newest it-girl pop rapper. Minaj is a bit of a contradiction: a rapper who sports pink hair and lacks the street background that one assumes is a prerequisite for a successful rap career. And yet, she is arguably the most successful female rapper of the past year (or five years?). So what is it about Minaj that makes her so successful?

Figured it was the perfect question for Quora. Very non-Googleable, and the type of question that is subjective and filled with subtleties. There's also been some interesting controversy stemming from a comment that producer Irv Gotti made about Minaj, declaring she was "just as talented as Lauryn Hill."

The answers on Quora were interesting and thoughtful, and not only that, but they came from people who I would classify as informed and influential in the world of pop music. The accusation that Quora is simply a Silicon Valley playground is, I believe, not entirely true. But could it be that the tech world is so insular that we have made the naive assumption that every industry is as narcissistic and analytical as ours?

That said, other verticals outside of tech still have a long way to go in terms of the depth and insight in answers. I was sort of hoping for someone to detail for me the entire history of women in rap music (including the very significant mid-1990s throw-down between Lil Kim and Foxy Brown) and the conditions that specifically led to Minaj's ability to rise so quickly.

It's an interesting time to be a female rapper. Bieberism and bubble-gum pop in general have reached a fever pitch, and it would seem that the next great wave of hip-hop might be grittier in response; yet the opposite has happened.

While Minaj may not be the next Lauryn Hill, the best part about her is that she doesn't want to be. She's received some criticism for not doing her own stuff, though perhaps that has allowed her to leverage the existing fan bases of giants like Kanye West and Eminem to cultivate her own.

I wonder if, because this sort of question has no clear answer, no answer will ever be completely satisfying. Perhaps that will ultimately be the downfall of Quora (though I believe that not at all, and relish the highly speculative musings that are posted on the site).

It's surprising to me that people don't get the Quora hype or don't buy the excitement and value of the service. This, to me, makes no sense. Through what other medium can you access this type of expertise? Sure, I could go read a bunch of articles in pop mags and blogs, but they generally tend to avoid pontificating in favor of actual reporting.

And good for them, but I think there's this whole other kind of information that has yet to be catalogued in any sort of organized fashion. If you have a site that does metric conversions, you've probably missed the boat, as that information easily translates to a digital medium and was the first type of information to go digital in the early days of the web. Quora is about subtlety and gray areas, and this phase of information categorization is just beginning.

Maybe Quora's not ready for pop music quite yet, but when they do get there I'll be waiting.

Prolog Introduction for Hackers

January 28, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr

A Prolog Introduction for Hackers || kuro5hin.org

[P] A Prolog Introduction for Hackers

By tkatchev in Technology
Thu Feb 26, 2004 at 05:03:23 PM EST
Tags: Software

Strange, but true: Prolog is, without a doubt, currently the simplest and the most straightforward programming language of all mainstream programming languages; however, the special interests of academia and inept teaching have given it a horrible, pariah-like reputation. (After all, you cannot write a PhD thesis explaining obvious, practical things.) This article aims to ameliorate the situation; to introduce the practical simplicity of Prolog to those that might normally sneer at what they consider a horrible, convoluted playing field of doctorate theory.

The Prolog approach

Prolog is, essentially, a query language for databases, like SQL. However, unlike SQL, which is a limited query language for relational databases (tables of rows and columns, much like a spreadsheet), Prolog is a query language for matching complicated patterns against a database of simple facts.

Thus, all Prolog programs consist of three parts: a list of facts, a list of pattern matching rules (sometimes also called predicates in Prolog jargon) and a list of queries. (Also sometimes called goals in Prolog jargon.)

Prolog facts

Facts, in Prolog, are pre-defined patterns that get stored in Prolog's internal database, usually in a manner to make searching them more efficient.

There are, essentially, three types of values in Prolog:

  • Numbers.

    Ex: 1, -2, 0.5, -0.776.

  • Symbols, which are, for all intents and purposes, immutable strings made of lower-case letters, digits and underscores, always starting with a lower-case letter. (We'll explain why symbols must start with a lower-case letter later.)

    Ex: hello, world, this_is_a_symbol, atom3.

  • Linked lists of symbols or numbers. Lists are untyped.

    Ex: [hello, cruel, world], [1, 2, 3], [1, hello, 2, world], [].

Modern, practical Prolog implementations define many more useful datatypes; however, this is implementation-dependent and doesn't really matter as far as this article is concerned. Look up your Prolog implementation's manual if you are interested.

Facts, or pre-defined patterns, are written using the standard notation of functions or procedures in other programming languages. The symbol before the opening parenthesis is the pattern's name, with a list of comma-separated values inside the parentheses.

Ex: f(hello), greeting_message(hello, world), g([hello, world]), fac(3, 6).

Note that the following is illegal in Prolog: f(). Patterns without arguments are written without parentheses, like this: f.

Also note that pattern arguments can have any of Prolog's datatypes, thus symbol, number and list arguments are allowed.

Thus, to define a fact in Prolog, all you need to do is to write a Prolog program that lists pre-defined patterns with a period (full-stop) after each entry.

Example of a Prolog program that defines several facts:

hello.
world.
f(hello, world).
g([hello, world]).
standard_greeting([hello, world], 2).

This simple program inserts five pre-defined patterns into Prolog's internal database. ( hello and world, two patterns without arguments; f(hello, world), a pattern f with two arguments; g([hello, world]), a pattern g with one argument, a list; and standard_greeting([hello, world], 2), a pattern standard_greeting with 2 arguments, a list and a number)

When several patterns are defined with the same name and the same number of arguments, Prolog will run through them one after another, in top-to-bottom order, when trying to match them. (You can think of this as a short-circuited "OR" of pattern matching rules.)
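
As a minimal sketch of that top-to-bottom order (my own example, not from the original article), consider three facts sharing the name color/1:

color(red).
color(green).
color(blue).

The query color(X). first returns X = red; asking for further solutions (usually by typing ;) yields X = green and then X = blue, in exactly the order the facts appear in the source.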

Defining pattern-matching rules

Defining pattern-matching rules in Prolog is equally simple:
f(hello, world) :- g([hello, world]).

Whenever Prolog sees the special symbol :-, Prolog creates a new pattern-matching rule. The basic meaning of :- is very simple: to match whatever is to the left of the :-, the part to the right must be matched. This lets us "decompose" complex patterns into smaller, more manageable ones.

To make the task practical, Prolog defines many operators that help us in the task of composing pattern-matching rules. Some of the more important and useful are:

  • ,: A comma denotes sequential matching of patterns; this is equivalent to a "short-circuited AND" in many imperative programming languages. ( C's &&, for example.)

    Ex: f :- a, b, c.

    This means that to match the pattern f, we need to match, in order, patterns a, b and c.

  • ;: A semi-colon denotes choice; this is equivalent to a "short-circuited OR" in many imperative programming languages. ( C's ||, for example.)

    Ex: f :- p; q; r.

    This means that to match the pattern f, either p must be matched, or, if Prolog fails to match p, try to match q; if matching q fails, finally try matching r.
    Note that the semi-colon is essentially equivalent to listing patterns on separate lines; thus,
    f :- (p; q).
    is equivalent to
    f :- p.
    f :- q.

  • ->: An arrow denotes a conditional pattern rule, in other words, an "if-then-else" rule.

    Ex: f :- (g -> h; i).

    This code means that Prolog first tries to match pattern g; if the pattern can be matched, try to match the pattern h. If g cannot be matched, try to match the pattern i.
    Note that the construct must be enclosed in parentheses, due to strangeness in Prolog's syntax rules.

  • \+: This is equivalent, in a sense, to the negation operator in many programming languages. (Like the C !.)

    Ex: f :- \+ g.

    This code means that the pattern f matches whenever the pattern g cannot be matched.



Variables

This is all very easy to understand, but is, unfortunately, useless in real-world applications. What we lack are variables. Variables in Prolog are symbols that begin with a capital letter; for example: Var, A, Q, MATCH_ME.

Whenever Prolog comes upon a variable, it starts searching in its internal database of facts and pattern-matching rules for a substitution such that substituting a value for the variable matches some fact.

A simple example will illustrate the concept better. Consider the following Prolog program:

i_know(hello).
i_know(world).
is_phrase(A, B) :- i_know(A), i_know(B).
is_greeting(A, B) :- is_phrase(A, B), A = hello.

This program defines two facts ( i_know(hello) and i_know(world)) and two pattern-matching rules.

is_phrase(A, B) :- i_know(A), i_know(B). This rule, which is named is_phrase and which accepts two arguments, tries to find substitutions for the two variables it uses ( A and B) such that the pattern on the right side of the :- matches against Prolog's internal fact database.

In this particular case, Prolog will find the following substitutions:
A=hello, B=hello
A=hello, B=world
A=world, B=hello
A=world, B=world

is_greeting(A, B) :- is_phrase(A, B), A = hello. Again, a pattern of two arguments. As before, Prolog will try to find substitutions for the two variables A and B such that the pattern rule matches.

The new concept here is the operator =. This operator is Prolog's equivalent of variable assignment.
P=Q means that Prolog will find substitutions for variables such that the two arbitrary patterns to the left and to the right of the = are exactly equal. Note that in this operator Prolog doesn't care whether or not the two patterns can be matched against its internal database; what matters is that the two patterns become equal after = has finished its work. The = operator is commutative; thus, A=B and B=A mean the same thing. If such a substitution cannot be found, the = operator will fail to match. For example, hello=world will always fail to match.
Thus, after executing i_know(A)=i_know(foo) A will be substituted with foo even though i_know(foo) does not match against Prolog's internal database. (By the way, this procedure is often called unification in Prolog jargon; thus, A=hello means that A will be unified with hello.)

Finally, we can figure out what the pattern is_greeting(A, B) does. Here, Prolog searches for substitutions for A and B such that a match against the known facts is found.
Expanding all the pattern-matching rules, Prolog will find the following substitutions:
A=hello, B=hello
A=hello, B=world

As you can see, using just a few basic Prolog operators and Prolog's advanced search engine, we can build pattern-matching rules of arbitrary complexity. It is here that Prolog's power really shines through: Prolog lets you build very complicated matching rules very easily. Thus, Prolog is designed for use in applications where we have just a few very simple facts, but a lot of very complex search rules.

Contrast this with SQL, which has been designed for the opposite situation: a great amount of very complex data with very simple, very basic search rules.

Programming in Prolog

While the basic Prolog engine is enough to perform arbitrarily complex searches, it is not enough to use Prolog as a general-purpose programming language.

At the very least, what we miss is a way to do basic input/output, a way to handle arithmetic and a way to do loops.
In true Prolog spirit, the designers of the language decided not to complicate the language with unnecessary constructs and unnecessary syntax, but instead to write simple, basic hacks that integrate very well with the basic Prolog query language. (Contrast this with Oracle's PL/SQL, for those that know it.)

Input and output in Prolog is done with special pattern rules that always match, and produce output or return input as a side effect. Since there are a great number of implementation-specific input and output functions, I will describe two of the most basic, to give the general idea. If you want to learn more, consult the manual of your Prolog implementation.

Outputting a value is very simple: the pattern-matching rule write(A) will output its argument. This pattern-matching rule always matches. For example, write_greeting(A, B) :- is_greeting(A, B), write(A), nl, write(B), nl. will output two words on the screen, provided that the two words are a greeting. (The nl pattern simply outputs a newline character.)

Basic input is equally simple: the pattern-matching rule read(A) will ask the user to input a value, using Prolog syntax, substitute the typed value for A and match successfully. For example, this simple pattern-matching rule of zero arguments will simply output whatever value the user typed: echo :- read(A), write(A), nl.

For arithmetic, Prolog uses a special operator called is. This operator is just like the = operator, except that the right side must be an arithmetic expression, not a pattern. For example, A is B + 1 will substitute a value for A that is equal to B + 1. Due to the special syntax of the is command, it is not commutative. Thus, B + 1 is A will give an error. In practice this means that you almost always put a single variable to the left of the is.

Note, however, that the is operator is still nothing but a special pattern-matching rule; so, for example, A is A + 1 is an invalid expression that will likely give an error, since no number can be substituted such that the number becomes equal to itself incremented by one. If this makes little sense, try making some simple substitutions in your head: 5 is 5 + 1 violates the basic rules of arithmetic.

Loops in Prolog are very simple; like other well-known programming languages, looping is done by writing recursive pattern-matching rules.

An example: infinite_loop(A) :- write(A), nl, infinite_loop(A). This rule will run an infinite loop that prints its argument infinitely many times. Using recursive applications of pattern-matching rules, any looping construct can be expressed. Recursion is equivalent to looping, as any programmer of functional languages knows.
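
For a loop that actually terminates, here is a small sketch of my own in the same style, a countdown that stops at zero:

% count_down(N) : print the integers from N down to 1, then stop.
count_down(0).
count_down(N) :- N > 0, write(N), nl, M is N - 1, count_down(M).

Querying count_down(3). prints 3, 2 and 1; the first clause matches once the argument reaches 0 and ends the recursion, playing the role of the loop's exit condition.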

Prolog queries

Queries, or goals, are a way for the user to interact with Prolog's internal database and pattern-matching rules. Almost all Prolog implementations include an interactive shell where the user can type arbitrary patterns; Prolog then tries to match the given pattern, outputting whatever variable substitutions are needed to match the pattern. This provides an easy and powerful way to interact with Prolog in real-time, typing in search queries and getting back results immediately.

Note, however, that the majority of Prolog implementations do not allow defining new patterns interactively; a Prolog program must be written using a separate editor and loaded into the interactive Prolog shell.

Query syntax, like almost everything in Prolog, is elegantly simple. Here is an example of an interaction, based on the previously shown program:

Ciao-Prolog 1.9 #44: Mon Dec 30 16:47:15 2002
?- ensure_loaded('i:/ivan/foo.pl').
yes
?- f(hello).
yes
?- f(foo).
no
?- is_phrase(hello, world).
yes
?- is_phrase(hello, _).
yes
?- is_greeting(A, B).
A = hello,
B = hello ? ;
A = hello,
B = world ? ;
no
?- is_greeting(_, B).
B = hello ? ;
B = world ? ;
no
?- hello.
yes
?- foo.
{ERROR: user:foo/0 - undefined predicate}
no
?-

The ?- is the standard Prolog prompt; it means that the user is invited to type in a Prolog pattern, ending with a period, as all Prolog statements do. Note, also, that Prolog returns either a yes or a no after each query; a yes means that Prolog was able to match the query against its internal database; a no means that Prolog was unable to find a match. (Notice that trying to use an undefined pattern foo caused a Prolog error.) When Prolog encounters variables in the query (like A and B in this example), Prolog prompts us before returning each found value for the variable, one by one. Finally, there is a special variable called _, which means that we do not care about the values of this variable and do not want them printed.

An extended example

To solidify your understanding of the abstract underpinnings of Prolog, here are a few very simple programs in Prolog. (In Prolog, the % character denotes comments.)

Calculating the factorial:

% fac(N,R) : R is the factorial of N.
fac(0,1).
fac(N,R) :- P is N-1, fac(P,Q), R is N*Q.

Using this pattern is simple: typing the query fac(5,R)., for example, gives the result R = 120. When playing around a little, though, many deficiencies start becoming apparent, including unwanted infinite loops and the fact that fac(N,120) doesn't give the expected result. The reason for this is that Prolog is very ill-suited for numerical computation; arithmetic in Prolog is a hack, and doesn't integrate into the Prolog search engine too well.

Addendum: Here is a tail-recursive version of the factorial pattern.

% fac2(N,A,R) : R is the factorial of N; A is the "accumulator" value.
fac2(0,R,R).
fac2(N,Acc,R) :- P is N-1, Q is N*Acc, fac2(P,Q,R).

Usage: fac2(5,1,R).

Reversing a list:

% cat(A, B, C) : C is the concatenation of lists A and B.
cat([], R, R).
cat([H|T1], Z, [H|T2]) :- cat(T1, Z, T2).
% rev(A, B) : B is the list A when reversed.
rev([], []).
rev([H|T], Q) :- rev(T,P), cat(P,[H],Q).

(Here we first encounter the special Prolog syntax for handling lists; the special system pattern [H|T] matches any list whose first element is H with T being the rest of the list without the first element.)
Using this pattern is simple: rev([1, 2, 3],R). In other words, find all such R that rev([1, 2, 3],R) matches the Prolog pattern-match rule database. Notice, however, that rev(R,[1,2,3]) gives us the same result! This is obvious if you think a little bit about the nature of Prolog's pattern-matching engine.

As an example, here is a second, simpler version of rev that doesn't rely on concatenating lists:

% rev2(A, B, C) : C is the reversal of the list A; B is the "accumulator"
% for growing lists.
rev2([], R, R).
rev2([H|T], Z, R) :- rev2(T, [H|Z], R).

Here, the second parameter of the pattern-matching rule is used as an "accumulator" for holding intermediate values. Using this rule is simple: rev2([1, 2, 3], [], R). Look at how smart the Prolog pattern-matching is when it processes lists and other symbolic data: not only does rev2(A, [], [1, 2, 3]) produce the expected results, but rev2([1, 2, 3], B, [3, 2, 1]) produces B = [], and rev2([1, 2, 3], A, [3, 2]) returns us a pattern-match failure.

Addendum

Wait one second, Prolog is a logic-based language!

No, it is not. Prolog has several system patterns that were (very) loosely inspired by formal logic; these were designed to ease the use of Prolog for those people that are already familiar with formal logic. However, if you persist with your delusion that Prolog is a language for "handling logic predicates", you will get bitten sooner or later! (And probably sooner than later.)

Standard Prolog

Prolog is a very established, industrial-strength, popular language. As such, there is a very clear and formal ISO standard for Prolog interpreters and compilers. You can view the standard for ISO Prolog here, for example.

Strange language

Prolog programmers and implementors simply love using non-standard, confusing and sometimes plain wrong language. Do not be afraid of this peculiar trait; when encountering strange or confusing terms, be aware that 90% of the time a very simple and down-to-earth concept is hiding behind it.

The "cut" operator

The so-called "cut" operator (written as !) is a pre-defined operator in Prolog that allows the programmer to subtly tweak the search strategy used by the Prolog matching engine; while this operator allows some neat programming tricks and optimisation techniques, I do not advise anyone ever to use this operator. It is a dirty, very confusing and dangerous hack that will inevitably make your program impossible to read and introduce subtle bugs. In short, avoid the "cut" like the plague.

Open-Source Prolog

Prolog is a programming language with a great deal of choice as far as implementations are concerned; there are lots and lots of good, high-quality implementations of Prolog that are Open-Source.

Try it yourself

You can try interacting with a Prolog shell without installing a Prolog implementation here. (The link points to tuProlog, an Open-Source Prolog written on top of the JVM and embeddable into standard Java applications.)

For fun

Just in case you are wondering how Prolog can be used in the "real world", take a look at this simple roguelike game written entirely in Prolog: Caves of Golorp.

Though of course, Prolog is very useful for non-toy applications as well. Browsing through the site of any commercial Prolog vendor, (and there are lots of them) you will inevitably stumble upon a page listing many serious, industrial-grade applications in Prolog.

Warning to Americans

Prolog is decidedly a European language; not only was it originally invented by a Frenchman, but the majority of Prolog implementations are developed and used in Europe, and even in a purely academic setting, Prolog is much more popular as a teaching language in European universities than in higher education in the U.S.

Despite this, do not be afraid; Prolog is a very useful tool in its own right. Good luck.


Achieving a profitable product/market fit

January 28, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr

The practicalities of CSS Media Queries, lessons learned.

CSS Media Queries are part of CSS3; in brief, they allow you to tailor your website's appearance for different screen sizes. Most people I speak to know about media queries but have been a little shy about trying them out. You can read about media queries elsewhere; here, I'd like to speak about a first-hand experience of using them.
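
As a quick illustration (a minimal sketch of my own; the 480px breakpoint and the #content selector are just placeholders), a media query wraps ordinary CSS rules in a condition about the viewport:

/* Applied only when the viewport is 480px wide or narrower */
@media only screen and (max-width: 480px) {
    #content {
        width: auto;  /* drop the fixed width in favour of a liquid layout */
        float: none;  /* collapse the columns into a single one */
    }
}

Browsers that understand media queries apply these rules on small screens; older browsers simply ignore the whole block.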

 

Should you or shouldn't you?

Last year sometime, we argued about how best to deliver the mobile experience of Bloop to our users. There were really only two options, both valid, yet utterly conflicting:

1) Use media queries. For the user: she visits http://bloop.co/ and instead of viewing the regular webpages, the page content is styled appropriately for small screens. For us: it's one website, one view of the content, and just another css file to adapt the content for mobile viewing.

Pros: 

  • Less work for us to do initially, and a total piece of cake to maintain.
  • New features and content go straight to the mobile site (it is, after all, just the same site).

Cons:

  • Everything served to a laptop or desktop is going to get pumped down the 3G connection to mobile.
  • Media queries won't work on older browsers (Nokia, Blackberry, I'm looking at you).
  • Any dependence on JavaScript or images can't be assured to run (again, Nokia, Blackberry).

2) Make a specific mobile website. For the user: she visits something like http://m.bloop.co/ and views a mobile-optimised website. For us: this is effectively another website, meaning another set of templates and styles. At the time, it was suggested that we could do some sniffing for the browser type and redirect them automatically to http://m.bloop.co/ if they were visiting us from a mobile browser.

Pros:

  • We can remove all the javascript files and non-critical content and serve the most lightweight pages to mobile, for fastest possible page loads.
  • We can ensure the site loads on most phones, but it's going to be almost text-only: no CSS floats, no JavaScript, etc.

Cons:

  • Users need to know to go to m.bloop.co, or else we need to rely on user-agent sniffing (long since shown to be a terrible idea).
  • It's a buttload of work.
  • It'll work on most mobiles alright, except that if you have an iPhone or Android, you're going to be browsing a site that looks like it's from 1985.
  • Features will take longer to get to mobile, no matter what we promise ourselves. We have effectively added another platform to support.

Twice, historically, we chose Option 2 in the name of page load times and device compatibility. This weekend, I tried Option 1, because we don't have time for Option 2. What follows is my experience, opinions and tips.

 

My experience with media queries

In an hour, and with around 60 lines of CSS, I had the mobile site 80% complete. Our website is generally a fixed-width design, mobile needed to be liquid, and that was the main chunk of the CSS. Any non-critical content got cut; for example, you can't edit your profile on your iPhone now. There was a lot of "display: none" to hide this content, but I'd attempt to deal with that later, somehow (remember, this means it's still being downloaded to the device, it's just not displayed on the screen).

Another hour and perhaps another 100 lines of CSS. Everything has shrunk down to fit the screen, font sizes are adjusted for better legibility, and the page architecture has changed from a two-column layout to just one so that it flows from top to bottom, a bit like this: Nav > Main Section > Optional Secondary Section > Footer.

Two more hours and 200 lines of CSS. Critical UIs (creating an event, inviting friends) needed some special consideration for the mobile experience. When you access these parts on mobile now, you get a sort of pop-up that fills most of the screen. I'll come back to adjust it after I've used it for a while.

So maybe four hours in, we're at about 300 lines of CSS in one separate CSS file, and using the website on my phone is now pretty delightful. This is so obviously the superior option. Absolutely zero has changed in the core code. And what of the Blackberry and Nokia users? Blackberry is moving to WebKit. I guess we can always do a mobile-only site for you when we have more time (and if you actually need it), but otherwise, at least the "heavy" mobile web users have something, and it took me an evening.

 

Useful Tips

Here are some of the more complicated bits that I thought I'd share to save you some time, since I couldn't find much about them online.

Mobile Safari auto zoom on inputs and textareas

The iPhone (and maybe Android) zooms in on a text area when you focus it. That's really annoying, especially for some parts of this UI where you need to see the full width. Opinions online say to use the meta tag for user-scalable=no, which means users can't pinch and zoom. Don't do that; it's a really bad decision, and it isn't solving the actual problem, which is:

If your font-size in the textareas/inputs is less than 100%, Mobile Safari zooms in proportionately to make the font-size equal 100%. 

Fix it by setting the font-size in the textareas to 100%, or you can use -webkit-text-size-adjust: x% (it must be a percentage; neither em nor px). I originally had body font-sizes for mobile at 80%, so the zooming stopped happening with -webkit-text-size-adjust: 125% (and above). Naturally, 125% of 80% is 100%.
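
In CSS, the fix looks something like this (a sketch of my own; the exact selectors, and where you put the size-adjust rule, are assumptions):

textarea, input {
    font-size: 100%;  /* stops Mobile Safari's focus zoom */
}

/* or, compensating for an 80% base font-size instead: */
body {
    -webkit-text-size-adjust: 125%;
}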

Removing the Safari wrapper on a web app

You can remove the Safari wrapper with a <meta> tag setting apple-mobile-web-app-capable to yes. Don't be tempted. I bookmarked the Bloop web app to my Home screen, and it runs without the Safari chrome indeed, but clicking links actually launches Safari to view these new pages. There's some chat about that elsewhere. Overall, it's not a wise thing to do to your users: you lose the refresh button and the familiar page-loading indicators, for example. They should be using what they know; keep the Safari window there.

Get rid of jQuery and CSS animations and transitions

Remove all of your jQuery animations, slides and fades, and your -webkit- or -moz-transitions. I tested on iPhone 3GS, iPhone 4 and Nexus S; the first two ran painfully sluggishly, and the latter wasn't exactly a peach either. It really defeated the point of the animations being there in the first place, so they got cut. This is a bit of a pain in the ass:

body * {
    -webkit-transition: none !important;
    -moz-transition: none !important;
}

but now you need to revisit some of your jQuery. I decided to look for any transitions and animations and swapped in an if:

if (screen.width > 480) {
    slideUp/slideDown/fadeIn/fadeOut
} else {
    show/hide
}

 

However, you can't test that on your computer, because it's unlikely that your monitor is 479px wide. For testing purposes I switched from measuring 'screen.width' to 'window.innerWidth', and later realised that media queries are actually checking those same dimensions (window size, not screen size), so I left it at that.

A general, perhaps obvious tip

I've learned in the last week or so that CSS is really where you need to do *all* of your aesthetic control, and when you move into media queries, this becomes much more obvious. jQuery inserts styles inline, which will overpower any CSS. To keep it clean, as a basic example, instead of $element.hide() and $element.show(), I now use $element.addClass('hidden') and $element.removeClass('hidden'), where in your CSS file you obviously have something like .hidden { display: none; }.

fadeIn, fadeOut, slideUp and slideDown aren't too easy to do with CSS, so if you particularly need those, then use them, I guess.

@font-face on mobile

@font-face, ah yes, my difficult but dearly beloved friend. @font-face is awesome for the desktop/laptop experience so long as you follow Paul Irish's advice. However, I soon learned that Paul's bulletproof method ain't so bulletproof on mobile. The local() part freaks out Android, and it stops there; no @font-face'd fonts came through. There's an easy fix though: you've got your mobile.css now for making special changes for narrow-screened devices, right? Just remove the .eot and local() stuff (that's there to support IE). So my bulletproof @font-face adaptation now looks like:

@font-face {
    font-family: "pictos-web";
    src: url("type/pictos-web.woff") format("woff"),
         url("type/pictos-web.ttf") format("truetype"),
         url("type/pictos-web.svg#webfontIyfZbseF") format("svg");
    font-style: normal;
    font-weight: normal;
}

I make no comment on the merits of actually using external fonts on mobile. In my case, I'm using vector icons, not fancy typography. I'm sitting out that argument until I find some stronger evidence one way or the other.

Remaining unsolved issues

The only issue I've thought about and haven't had time to find an easy solution for is that mobile sites normally offer an option at the bottom like (Mobile | Standard). That immediately strikes me as an awkward option to provide now, since to load the standard stylesheet I'll need to fake the screen width. I'm sure JavaScript provides something, but it feels like I'm hacking something I shouldn't. I'd like to hear if anyone has a solution for that.

 

Conclusions

Just try it, you'll like it.

It shocks me that Posterous doesn't do this. Posterous is "PRIME" for this sort of move, but they seem to have gone with Option 2, and I'm not sure that's the best choice for anyone targeting smartphones.

Hope this was useful to some of you. I'm looking forward to hearing how our users feel about this, so tell me what you think: http://bloop.co/ (shrink your computer browser or use your phone), and of course hit me up with any corrections and feedback. Thanks.

@smcllns

 

Syria Internet Down As Egypt Blackout Catches On In Middle East

January 28, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr

Tech News and Analysis «

Amid spreading protests, the Egyptian government took the incredible step of shutting down all communications late Thursday. Only a handful of web connections, including those to the nation's stock exchange, remain up and running.

It's an astonishing move, and one that seems almost unimaginable for a nation that not only has a relatively strong internet economy but also relies on its connections to the rest of the world.

But how did the government actually do it? Is there a big kill switch inside Egyptian President Hosni Mubarak's office? Do physical cables have to be destroyed? Can a lockdown like this work?

Plenty of nations place limitations on communications, sometimes very severe ones. But there are only a few examples of regimes shutting down communications entirely: Burma's military leaders notably cut connectivity during the protests of 2007, and Nepal did something similar after the king took control of the government in 2005 as part of his battle against insurgents. Local Chinese authorities have also conducted similar, short-lived blockades.

The OpenNet Initiative has outlined two methods by which most nations could enact such shutdowns. Essentially, officials can either simply close down the routers which direct traffic over the border, hermetically sealing the country from outsiders, or go further down the chain and switch off routers at individual ISPs to prevent access for most users inside.

In its report on the Burmese crackdown, ONI suggests that the junta used the second option, something made easier because it owns the only two internet service providers in the country.

The Burmese Autonomous System (AS), like any other AS, is composed of several hierarchies of routers and provides the Internet infrastructure in-country. A switch off could therefore be conducted at the top by shutting off the border router(s), or a bottom up approach could be followed by first shutting down routers located a few hops deeper inside the AS.

A high-level traffic analysis of the logs of NTP (Network Time Protocol) servers indicates that the border routers corresponding to the two ISPs were not turned off suddenly. Rather, our analysis indicates that this was a gradual process.

While things aren't clear yet, this doesn't look like the pattern seen in Egypt, where the first indications of internet censorship came earlier this week with the blockades against Twitter and Facebook - but when access disappeared, it disappeared fast, with 90 percent of connections dropping in an instant.

Analysis by Renesys, an internet monitoring body, indicates that the shutdown across the nation's major Internet service providers happened at precisely the same time, 12:34am local time:

Renesys observed the virtually simultaneous withdrawal of all routes to Egyptian networks in the Internet’s global routing table. The Egyptian government’s actions tonight have essentially wiped their country from the global map.

Instead, the signs are that the Egyptian authorities have adopted a very careful and well-planned approach to screening off internet addresses at every level, from users inside the country trying to get out to the rest of the world trying to get in.

"It looks like they're taking action at two levels," Rik Ferguson of Trend Micro told me. "First at the DNS level, so any attempt to resolve any address in .eg will fail. But also, in case you're trying to get directly to an address, they are also using the Border Gateway Protocol, the system through which ISPs advertise their internet protocol addresses to the network. Many ISPs have basically stopped advertising any internet addresses at all."

Essentially we're talking about a system that no longer knows where anything is. Outsiders can't find Egyptian websites, and insiders can't find anything at all. It's as if the postal system suddenly erased every address inside America and forgot that it was even called America in the first place.

A complete border shutdown might have been easier, but Egypt has made sure that there should be no downstream impact, no loss of traffic in countries further down the cables. That will ease the diplomatic and economic pressure from other nations, and make it harder for protesters inside the country to get information in and out.

Ferguson suggests that, if nothing else, the methods used by the Egyptian government prove how fragile digital communication really is.

"What struck me most is that we've been extolling the virtues of the internet for democracy and free speech, but an incident like this demonstrates how easy it is - particularly in a country where there's a high level of governmental control - to just switch this access off."

Image courtesy of Flickr user Muhammed Ghafari

Engaging Recruiters with Hacker Trading Cards

January 28, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr

Hacker Trading Cards - Hackthology

The spring career fair at Case Western Reserve University is coming up next week. Instead of collecting swag from all the employers, we decided to make Hacker Trading Cards to give to companies as CWRU Hacker Society swag! We would like employers looking to hire CS students to give talks at Hacker Society this semester and thought this was a creative way to get their attention.

This wasn't intended to be a comprehensive collection, so what cards would you have created? What other creative ways would you suggest for student groups to engage companies?

Click here to download:

hacker-trading-cards-tBwelmFuxtvcEzrjHDAG.zip (4049 KB)

If you'd like to make your own Hacker Trading Cards, fork our gist! 

2600 back issue prices reduced by 60%

January 28, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr


20TH CENTURY BACK ISSUE PRICES SLASHED
Posted 25 Jan 2011 05:18:04 UTC

When you publish a magazine for over 25 years, the time is going to come when you start to run out of space. That time has arrived for us, and we need to clear out space for new issues, new projects, and occasional fugitives. So we've cut the prices of our oldest remaining issues, that is, those published during the 20th century. For us, that period of time was between 1984 and 2000. Back issue prices from that period have been reduced by 60 percent to a mere $2.50 apiece for those that we haven't completely run out of yet.

We haven't forgotten about the 21st century, though. Bulk issue discounts for that period of time (2001-2010) are in full effect, along with specials on complete sets coupled with lifetime subscriptions.

As 2600 back issues never really get outdated (in fact, we believe that the older ones are the most fascinating regarding all of the many changes in technology), this is a great chance to learn about our history and the various toys we've been playing with from one year to the next. And if we get some more breathing room in the process, everybody wins.

Visit the 2600 store here.





Get rich quick ads from 1909

January 28, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr

Popular Mechanics - Google Books


Published by Hearst Magazines.

We Can Do Better: The Overlooked Importance of Professional Journalism

January 28, 2011

This Hacker News full feed lovingly brought to you by the team behind LazyReadr

The 3 most important things I learned from Google (part 2)

none of which are likely to be endorsed at Business School
This is the second of three posts.

2) Operate as if you exist in a vacuum.

Most websites that make money do so by placing ads on their pages... and there's a good chance that those ads are provided by Google. But it wasn't always that way. Back when Larry and Sergey were at Stanford, they wrote a paper called The Anatomy of a Large-Scale Hypertextual Web Search Engine (http://infolab.stanford.edu/~backrub/google.html), where they expressed their skepticism and concerns about monetizing a search engine with advertising.

Having come from Netscape, I was quite familiar with ad models on the web: banner ads at 2% average click-thru rates and text links at 0.1%. These could be sold by category, but most were likely run-of-site ads (they show up wherever there is free space, regardless of what page it is). This was the era of the “punch the monkey” banner ad. Basically, in 1999 advertising on the web was so untargeted that click-thrus began to tail off. The novelty of clicking on banners was largely over. The solution that creative folks came up with was to make banners into little games or colorful, seizure-inducing flashing ads that you would click on just to get rid of them. Yes, the click occurred, but no transaction was going to happen at the offending advertiser's website. Ads that cost money but don't produce a sale are bound to drop in rates (price)... and that was what was generally happening.

I was the website bizdev guy so I was on the inaugural ad design team. I was one of the few people sitting around the conference room table when Larry [Page] walked in and declared that “ads can be as relevant as search results” and then basically walked out. I scratched my head a bit and remember thinking “what does that mean?”. What it meant was that we were in no hurry to make money from our website. It meant that we weren’t going to have graphical banners on our site. It didn’t matter that all of the advertisers of the world were used to hiring an agency, having a small number of creative banners produced and then distributed to the pages of the web. It didn’t matter that not only were our ads going to be different, but it was going to be an entirely different sale and require changing buyer behavior and a whole new process. We were not going to consider how the entire industry worked, we were just supposed to build this thing as if the rest of the world and its established practices didn’t exist...as if we were operating in a vacuum.

Larry was right on. The more qualified and targeted an ad is, the better it will perform and the higher the rates will be. Better = more revenue. What he realized was that you can't get any more targeting potential than someone explicitly telling you what they are looking for. That is what you do when you go to a search engine. If you have an advertisement that answers that question, it actually adds value to the user. The ad becomes relevant to the search and is a reasonable place to click.

Of course there was a problem. Graphical banner ads are not customizable, and no agency is going to produce a multitude of banners to match against individual keyword searches. Our solution was to use text ads instead of graphical banners (which had the added bonus of being very fast at a time when internet connections were still relatively slow). We created a very simple online interface which would allow advertisers to create as many custom text ads as they wanted and experiment with which keyword phrases to advertise against. We wanted them to be able to learn which keywords resulted in the most clicks, not only because it would mean more revenue for us, but because more clicks meant the ads were relevant to the searcher.

Changing user behavior, in this case that of the entire advertising world, is not easy. What we knew and ultimately figured out was that we needed to economically incentivize advertisers to do a better job advertising to our users. All advertisers wanted the top ad positions on the page, but we made it so that they couldn't simply buy their way to the top. Instead, their position among all the ads was determined not only by what they bid for their ad, but by cross-referencing that bid against the click-thru rate of each specific ad for each specific search term used by the searcher. You could pay less for a better slot if you had a better, better-targeted ad. We also encouraged the advertiser to link to the relevant page within their own site that related to the search, not just to their home page (i.e. if you searched Google for "wedding china", an ad for Williams-Sonoma might appear; a click on the ad would lead the searcher to the wedding china page, not the williamsonoma.com homepage, which is what used to occur). This type of thinking made the whole experience better. And the rest, as they say, is history.
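
To make that ranking mechanism concrete (the numbers here are purely illustrative, not Google's actual figures): if advertiser A bids $1.00 per click but their ad gets clicked only 1% of the time for a given query, while advertiser B bids $0.50 with a 4% click-thru rate, B earns the higher position, because $0.50 × 4% = $0.02 of expected revenue per impression versus A's $1.00 × 1% = $0.01. A better, better-targeted ad beats a bigger bid.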

My take-away is that:

Google completely disregarded the way the entire advertising industry worked and came up with its own solution as if it existed in a vacuum. And ultimately, it worked. Sometimes to find the right answer you have to look at a problem and work out a solution which completely disregards current standard practices.