Tuesday, August 21, 2007

What does Orkut have to do with crimes?

When I came home from work, I found Star News Asia reporting on how Orkut is a channel for depravity and crime. There were blurbs like "Bach ke rehna" (Hindi: save yourself, or stay away), "Maut ki website" (Website of death), "Kishoron ko kiya ja raha hai gumraha" (Youngsters are being misled), "Orkut par ankush jaroori?" (Should Orkut be censored?).

It started with a report of a young girl murdered by a "friend" she met on Orkut. Google News shows similar cases being covered in various media, too many to list. But what was strange about this news story was its exclusive focus on Orkut, which it blamed for:

  • Being a means of disseminating information to its users about rave parties (where all-night dancing and drugs are mixed together);
  • Misleading youngsters: while Orkut started as a means of connecting people, it has (according to the news story) become a hive of people talking about sex, rave parties, drugs, etc.
I find it strange to see Orkut being targeted so exclusively and being portrayed as promoting these activities. Several objections come to mind:
  • So this girl met this guy online, befriended him, and met him in a hotel room alone; the guy turned out to be not that friendly and caring and murdered the girl. Agreed, one knows less about people one meets online. Doesn't that call for some care and judgment on the part of the user?
  • This meeting could have happened offline. I know of people being harmed by friendly-looking acquaintances. What about that?
  • Is Orkut a motivator for rave parties, or are people wishing to go to rave parties using Orkut as a medium?
  • Why Orkut, and not other social networking sites? Because it is more popular in India? If Orkut were not popular, would people not have used some other social networking platform? If none were available, would they not use email lists, IRC channels, BBSes? Shall we ban them?
  • Agreed, all kinds of users interact online and some are not so moral. Is Orkut responsible for what people post and tell each other? Is it the parent's or guardian's responsibility, or Orkut's, that young children are not exposed to adult content?
  • The internet is a dangerous place. Many sites and forums contain adult content. Places other than Orkut are used to organize parties, some of them "not so nice". Shall we block the internet?
The news story seems to me to be more an attempt to sensationalize and demonize and less to point to a problem, inform people, or offer a balanced view.

Here is where we could go from here:
  • Let's ban Orkut. Users will go to MySpace. Let's ban that, next. Then Facebook. Then ...
  • Finally we will have stomped out the evil social nets. Next target: BBSes and forums. They make it too easy for people to reach each other.
  • Ergh... we forgot about personal web-pages. No more personal websites. All media to be controlled and all publications to be approved by Almighty Censorship Alliance of Sensational News Channels.
  • Oops, we forgot about email. Bye bye, you nasty little disseminator of information about parties, drugs and personal details.
  • Now we pull the network cable out of the PC.
  • Ban books: too many porno flimsies around.
  • No more phone: you better come home and get interviewed by the family before you meet my daughter.
  • No talking to strangers on the roadside.
  • No roads.
Funny, I don't see anything about responsible and judicious use of communication anywhere in the list. I guess only news channels know how to do that. Ornery folks like us should just ban everything.

Sunday, August 19, 2007

Erlang bit syntax wrinkles

The Erlang bit syntax is wonderful, if complicated.

Say you have a binary like:

84> Bin1 = <<1,0,2>>.
<<1,0,2>>

Then, this will match:
85> <<A:1/binary, B:8, C:1/binary>> = Bin1.
<<1,0,2>> %% A => <<1>>, B => 0, C => <<2>>


But this will not match:
86> <<A:1/binary, B:1/binary, C:1/binary>> = Bin1.

=ERROR REPORT==== 19-Aug-2007::17:46:17 ===
Error in process <0.139.0> with exit value: {{badmatch,<<3>>},[{erl_eval,expr,3}]}

** exited: {{badmatch,<<1,0,2>>},[{erl_eval,expr,3}]} **

It seems that I can specify a literal '0' as an int type (the default type) but not a binary type (see http://erlang.org/doc/reference_manual/expressions.html#bit_syntax for the gory details). Wonder why?
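
A possible explanation, assuming commands 85 and 86 ran in the same shell session (the prompt numbers suggest so): after command 85, A, B and C are already bound, and B is bound to the integer 0. In command 86 the segment B:1/binary then has to match a one-byte binary against that existing value, and an integer is not a binary, so the whole match fails. With unbound variables (fresh names, or after calling f() in the shell to forget the bindings), three 1/binary segments should match fine. A minimal sketch; the module name bin_match_demo and the names X, Y, Z are made up for illustration:

```erlang
-module(bin_match_demo).
-export([demo/0]).

demo() ->
    Bin1 = <<1,0,2>>,
    %% Fresh variables: B is an integer segment (the default type), 8 bits wide.
    <<A:1/binary, B:8, C:1/binary>> = Bin1,
    %% Fresh variables again: three one-byte binary segments.
    <<X:1/binary, Y:1/binary, Z:1/binary>> = Bin1,
    %% A, C, X, Y and Z are one-byte binaries; B is the integer 0.
    {{A, B, C}, {X, Y, Z}}.
```

Calling bin_match_demo:demo() returns {{<<1>>,0,<<2>>},{<<1>>,<<0>>,<<2>>}}, so the zero byte itself is not the problem.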

Friday, August 17, 2007

GNU Emacs, SLIME, Allegro

For some reason, I decided that I just had to switch to GNU Emacs from XEmacs on my WinXP laptop. So I went and downloaded Emacs 22.1 for Windows and installed it.

Setting it up for Erlang was easy: I added this to C:\Documents and Settings\Parijat\.emacs:


;; Erlang:
;; Add the location of the elisp files to the load-path
(setq load-path (cons "C:/erl5.5.4/lib/tools-2.5.4/emacs" load-path))
;; set the location of the man page hierarchy
(setq erlang-root-dir "c:/erl5.5.4")
;; add the home of the erlang binaries to the exec-path
(setq exec-path (cons "c:/erl5.5.4/bin" exec-path))
;; load and eval the erlang-start package to set up everything else
(require 'erlang-start)


Now, to set up Common Lisp. I use Allegro CL 8.0 Free Express Edition. (Yes, one day I'll buy the full version and repay Franz, but only after getting myself Lisp Certified ;-)) I was following the instructions at http://www.franz.com/emacs/slime.lhtml and http://common-lisp.net/project/asdf-install/tutorial/setup.html .

This is what I did:

1. Opened up ACL and loaded asdf into it:


International Allegro CL Free Express Edition
8.0 [Windows] (Aug 17, 2007 22:59)
Copyright (C) 1985-2006, Franz Inc., Oakland, CA, USA. All Rights Reserved.

This development copy of Allegro CL is licensed to:
Trial User

CG version 1.81.2.23 / IDE version 1.80.2.21
Loaded options from C:\Documents and Settings\Parijat\My Documents\allegro-prefs.cl.

;; Optimization settings: safety 1, space 1, speed 1, debug 2.
;; For a complete description of all compiler switches given the current optimization settings
;; evaluate (EXPLAIN-COMPILER-SETTINGS).

[changing package from "COMMON-LISP-USER" to "COMMON-GRAPHICS-USER"]
CG-USER(1): (require :asdf)
; Fast loading C:\Program Files\acl80-express\code\ASDF.001
T



2. Then, downloaded asdf-install and unpacked it into "C:\Documents and Settings\Parijat\asdf-install" (the source tarball has several nested directories called asdf-install; the innermost contains the .asd file, and that's the one I am referring to here).

3. Told asdf about this directory:

CG-USER(4): (pushnew "c:/Documents and Settings/Parijat/asdf-install/" asdf:*central-registry* :test #'equal)


and compiled and loaded asdf-install:

CG-USER(5): (asdf:operate 'asdf:compile-op 'asdf-install)
CG-USER(6): (asdf:operate 'asdf:load-op 'asdf-install)


4. Well, all this stuff was, in the end, to get SLIME. I went to http://common-lisp.net/project/slime/#downloading and found the link to the latest SLIME. Now, on Windows, it seems asdf-install can't download and install tarballs off the net (at least on Allegro CL). So I downloaded the SLIME tarball and put it in C:\Temp. Then:

(setq asdf-install::*verify-gpg-signatures* nil)
(setq *cygwin-bash-program* "c:/cygwin/bin/bash.exe")


and then:


CG-USER(7): (asdf-install:install "C:/Temp/slime-2.0.tgz")


This did not work, complaining:


Warning: Cannot find tar command "C:\\PROGRA~1\\Cygwin\\bin\\bash.exe".
Error: Unable to extract tarball C:/Temp/slime-2.0.tgz.


It seems my change to *cygwin-bash-program* did not take effect inside asdf-install. Wonder why. I even quit Allegro, restarted it, and defined *cygwin-bash-program* before compiling and loading asdf-install. No luck. (One thing I did not try was to set asdf-install:*cygwin-bash-program* instead of the unqualified *cygwin-bash-program*, which, typed in the CG-USER package, presumably names a different variable.) So I just went and hardcoded tar-command in installer.lisp to return "C:\\cygwin\\bin\\bash.exe".

Now the above command works.

5. The Franz documentation instructs me to start the "mlisp.exe" or "alisp.exe" Lisp image (see next step). But my ACL 8.0 Express did not come with these images; it only has "allegro-ansi.exe". So, following the README in the acl80-express directory, I created the image by starting allegro-ansi.exe and typing the following in the "Debug window":

CG-USER(1): (build-lisp-image "sys:mlisp.dxl" :case-mode :case-sensitive-lower
                              :include-ide nil :restart-app-function nil
                              :restart-init-function nil)
CG-USER(2): (sys:copy-file "sys:allegro-ansi.exe" "sys:mlisp.exe")


Now I have an mlisp.exe and an mlisp.dxl. Sure enough, clicking on mlisp.exe launches a console Lisp without the GUI.

6. Now I have to load slime into Emacs. According to the instructions on the Franz webpage, I created a file "C:\Documents and Settings\Parijat\.slime.lisp", with the contents:

(load "C:/Documents and Settings/Parijat/.asdf-install-dir/site/slime-2.0/swank-loader.lisp")

(sys:with-command-line-arguments (("p" :short port :required)
                                  ("ef" :long ef :required))
    (restvar)
  (swank::create-server :port (parse-integer port :junk-allowed nil)
                        :style :spawn
                        :dont-close t
                        :coding-system (or ef "latin-1")))


and added this to my .emacs:

;; Allegro LISP + SLIME
(push "c:/Documents and Settings/Parijat/.asdf-install-dir/site/slime-2.0/" load-path)
(require 'slime)
(slime-setup)

(setq slime-multiprocessing t)
(setq *slime-lisp* "C:/Program Files/acl80-express/mlisp.exe")
(setq *slime-port* 4006)

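(defun slime ()
  ;; by convention, (interactive) is the first form in the body
  (interactive)
  (print "this is my slime")
  ;; start mlisp.exe in the background, loading .slime.lisp to launch swank
  (shell-command
   (format "%s +B +cm -L \"c:/Documents and Settings/Parijat/.slime.lisp\" -- -p %s --ef %s &"
           *slime-lisp* *slime-port*
           slime-net-coding-system))
  (delete-other-windows)
  ;; keep retrying until the swank server accepts the connection
  (while (not (ignore-errors (slime-connect "localhost" *slime-port*)))
    (sleep-for 0.2)))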


I restart emacs. I say "M-x slime". And wait. The mlisp.exe window shows up, but no connection.

I figure that the problem is that (swank:create-server :port ...), which is supposed to start the swank server, does not work. If I run it in the mlisp.exe window, the command does not return (okay, maybe it is not supposed to). I can't telnet to port 4006 on localhost, though.

Puff!!! It used to be simpler.

Tuesday, August 7, 2007

Concurrency: Erlang vs Java

It's fashionable to extol the high performance of Erlang concurrency these days. Programming Erlang, Section 8.11, has this problem:

Write a ring benchmark. Create N processes in a ring. Send a message
round the ring M times so that a total of N * M messages get
sent. Time how long this takes for different values of N and M.
Well, I copied off the Java program from here: http://www.sics.se/~joe/ericsson/du98024.html

And then I wrote this Erlang program:
-module(threadperftest).
-compile(export_all).

%% Run the benchmark for 1000, 2000, ..., 10000 processes, 1000 laps each.
timeit() ->
    register(mainproc, self()),
    io:format(" Procs,  Mesgs, SpawnTotal,  SpawnProc,   RunTotal,    RunMesg~n", []),
    timeit_aux(1000, 1000, 10000),
    init:stop().

timeit_aux(NProcs, NMsgs, NProcsMax) ->
    if NProcs =< NProcsMax ->
            main([NProcs, NMsgs]),
            timeit_aux(NProcs+1000, NMsgs, NProcsMax);
       true -> void
    end.

%% Build a ring of N processes, send the counter M around it, print timings.
main([N, M]) ->
    %% statistics(wall_clock) returns {Total, SinceLastCall}, both in ms
    {_, _W0} = statistics(wall_clock),
    FirstPid = spawn(fun loop0/0),
    % io:format("First: ~p~n", [FirstPid]),
    LastPid = setup(N-1, FirstPid),
    FirstPid ! LastPid,
    {_, W1} = statistics(wall_clock),   % W1: ms spent spawning
    LastPid ! M,
    receive
        stop ->
            void
    end,
    {_, W2} = statistics(wall_clock),   % W2: ms spent passing messages
    io:format("~6B, ~6B, ~10g, ~10g, ~10g, ~10g~n", [N, M, W1*1.0, 1.0*W1/N, W2*1.0, W2*1000.0/(M*N+N)]).

%% Spawn HowMany processes, each forwarding to the one created before it.
setup(0, PrevPid) ->
    PrevPid;
setup(HowMany, PrevPid) ->
    Pid = spawn(fun() -> loop(PrevPid) end),
    % io:format("~p: Linking to: ~p~n", [Pid, PrevPid]),
    setup(HowMany - 1, Pid).

%% First process: wait for the pid of the last process to close the ring.
loop0() ->
    receive
        LinkedPid when is_pid(LinkedPid) ->
            % io:format("First: ~p: Linking to ~p~n", [self(), LinkedPid]),
            loop1(LinkedPid);
        X ->
            erlang:error({unknownMessageError, X, self()})
    end.

%% First process: decrement the counter once per lap; at 0, tell mainproc to stop.
loop1(LinkedPid) ->
    receive
        M when is_integer(M), M > 0 ->
            % io:format("~p: Received: ~p~n", [self(), M]),
            LinkedPid ! (M-1),
            loop1(LinkedPid);
        M when is_integer(M), M =:= 0 ->
            % io:format("~p: Received: ~p, Terminating~n", [self(), M]),
            mainproc ! stop,
            done;
        X ->
            erlang:error({unknownMessageError, X, self()})
    end.

%% Every other process: forward the counter unchanged; terminate after forwarding 0.
loop(LinkedPid) ->
    receive
        M when is_integer(M), M > 0 ->
            % io:format("~p: Received: ~p~n", [self(), M]),
            LinkedPid ! M,
            loop(LinkedPid);
        M when is_integer(M), M =:= 0 ->
            % io:format("~p: Received: ~p, Terminating~n", [self(), M]),
            LinkedPid ! M,
            done;
        X ->
            erlang:error({unknownMessageError, X, self()})
    end.
The performance is something like this (all times are in milliseconds, except the last one, which is in microseconds):
 Procs,  Mesgs, SpawnTotal,  SpawnProc,   RunTotal,    RunMesg
  1000,   1000, 0.00000e+0, 0.00000e+0,    266.000,   0.265734
  2000,   1000,    16.0000, 8.00000e-3,    562.000,   0.280719
  3000,   1000, 0.00000e+0, 0.00000e+0,    969.000,   0.322677
  4000,   1000, 0.00000e+0, 0.00000e+0,    1672.00,   0.417582
  5000,   1000,    15.0000, 3.00000e-3,    2469.00,   0.493307
  6000,   1000,    16.0000, 2.66667e-3,    3234.00,   0.538462
  7000,   1000,    16.0000, 2.28571e-3,    4015.00,   0.572998
  8000,   1000,    16.0000, 2.00000e-3,    4750.00,   0.593157
  9000,   1000,    31.0000, 3.44444e-3,    5469.00,   0.607060
 10000,   1000,    31.0000, 3.10000e-3,    6375.00,   0.636863
The numbers in the RunMesg column are the time it takes to send one message from one process to another. Why is it increasing? Does the runtime have to do more work to find the target process?

Here is the Java output (program below):
 Procs,  Mesgs, SpawnTotal, SpawnProcs,   RunTotal,   RunMesg
   100,   1000,     16.000,      0.160,  15531.000,    15.531
   200,   1000,     62.000,      0.310,  27047.000,    27.047
   300,   1000,     94.000,      0.313,  37828.000,    37.828
   400,   1000,    125.000,      0.313,  46859.000,    46.859
   500,   1000,    141.000,      0.282,  54578.000,    54.578
   600,   1000,    203.000,      0.338,  60594.000,    60.594
   700,   1000,    234.000,      0.334,  65282.000,    65.282
   800,   1000,    406.000,      0.508,  68156.000,    68.156
   900,   1000,    484.000,      0.538,  69782.000,    69.782
  1000,   1000,    735.000,      0.735,  70015.000,    70.015
The last column is in milliseconds whereas for Erlang it was in microseconds, AND the number of procs is 100-1000, whereas for Erlang they were 1000-10,000.
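
One caveat when comparing the two tables: going by the numbers alone, the Java RunMesg column appears to be RunTotal divided by Mesgs (15531 / 1000 = 15.531), which would make it the time for one lap around the ring, not the time for one message. To compare like for like, both totals can be converted to microseconds per message. A small sketch; the module and function names are made up here, and it counts N * M messages, ignoring the extra shutdown lap that the Erlang program includes in its own figure:

```erlang
-module(ring_units).
-export([per_message_us/3]).

%% Microseconds per message, given the total run time in milliseconds
%% for M laps around a ring of N processes (N * M messages in total).
per_message_us(RunTotalMs, N, M) ->
    RunTotalMs * 1000.0 / (N * M).
```

For the 1000-process rows, ring_units:per_message_us(70015, 1000, 1000) gives about 70 microseconds for Java, and ring_units:per_message_us(266, 1000, 1000) gives about 0.27 microseconds for Erlang. That is still a large gap, but a smaller one than the raw last columns suggest.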

Monday, August 6, 2007

Programming Erlang exercise problem: trouble in the concurrent universe?

Going through Programming Erlang, I encountered this exercise problem in Section 8.11:

Write a function start(AnAtom, Fun) to register AnAtom as spawn(Fun). Make sure your program works correctly in the case when two parallel processes simultaneously evaluate start/2. In this case, you must guarantee that one of these processes succeeds and the other fails.
Here is my first solution, and it seems to compile and run.
-module(register_function).
-export([start/2]).

start(AnAtom, Fun) ->
    case whereis(AnAtom) of
        undefined -> register(AnAtom, spawn(Fun));
        _ -> error
    end.
Now, is there a race condition in the above? I mean, consider Processes A and B both executing start/2:
  1. Process A calls whereis(AnAtom) which evaluates to 'undefined';
  2. Process B calls whereis(AnAtom) which evaluates to 'undefined';
  3. Process A calls register(AnAtom, spawn(Fun)), which succeeds and returns true;
  4. Process B calls register(AnAtom, spawn(Fun)), which fails as AnAtom has been registered
Superficially, we have satisfied the requirements of the problem: Process A succeeded and B failed. But the sinister problem is that spawn(...) has been called by both processes anyway. Thus we have two instances of Fun being executed, with one of them in an anonymous process that I, at least for now, don't know how to access.

Ironically, the problem seems to arise because register presumably puts AnAtom in some "global" table: exactly the kind of shared-state concurrency problem that Erlang tries to avoid.

How would this be solved? Here is an idea:
  1. Create a "register_function" server;
  2. start/2 sends a message to this server with AnAtom and Fun as contents
  3. presumably the server processes one message at a time and the code between matching a message and executing some expressions is atomic with respect to other messages to the same server
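
The idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a vetted solution: the module name register_server, the message format, and the reply tuples are all made up here, and the server itself is assumed to be started exactly once before any client calls start/2. It also only serialises callers that go through this server; code calling register/2 directly bypasses it.

```erlang
-module(register_server).
-export([start_server/0, start/2]).

%% Start the serialising server and register it under a fixed name.
%% Assumption: called exactly once, before any client calls start/2.
start_server() ->
    register(register_server, spawn(fun loop/0)).

%% Client side: ask the server to register AnAtom as spawn(Fun)
%% and wait for its answer.
start(AnAtom, Fun) ->
    Ref = make_ref(),
    register_server ! {start, self(), Ref, AnAtom, Fun},
    receive
        {Ref, Reply} -> Reply
    end.

%% Server side: requests are handled one at a time, so the
%% whereis/spawn/register steps of two requests cannot interleave.
loop() ->
    receive
        {start, From, Ref, AnAtom, Fun} ->
            From ! {Ref, try_register(AnAtom, Fun)},
            loop()
    end.

try_register(AnAtom, Fun) ->
    case whereis(AnAtom) of
        undefined ->
            Pid = spawn(Fun),
            try register(AnAtom, Pid) of
                true -> {ok, Pid}
            catch
                %% register/2 raises badarg, e.g. if Fun has already exited
                error:badarg ->
                    exit(Pid, kill),
                    {error, register_failed}
            end;
        _Pid ->
            %% name already taken: Fun is never spawned
            {error, already_registered}
    end.
```

With this sketch, the losing caller gets {error, already_registered} and its Fun is never spawned, which is the property the first solution lacked.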
Puff. You'd think Erlang would come with built-in functionality for this. Is this important?