hisham hm

Migrating my projects from SourceForge+Subversion to Github+Git

It’s been a while that I’ve been using Git comfortably — even though I still get stumped now and then, the workflow is much better than Subversion. I do finer grained commits with ease and I’m no longer afraid of making branches. Still, some projects of mine are still hosted in svn servers, mostly due to inertia and because they’re mostly single-person projects with the occasional patch and bug report but not much collaboration. For a more “social” project such as LuaRocks, git and Github have proven to be very useful tools.

Farewell to SourceForge

This, however, was the push I needed to move my projects away from SourceForge. From gimp.org:

In the past few months, we have received some complaints about the site where the GIMP installers for the Microsoft Windows platforms are hosted.

SourceForge, once a useful and trustworthy place to develop and host FLOSS applications, has faced a problem with the ads they allow on their sites - the green “Download here” buttons that appear on many, many adds leading to all kinds of unwanted utilities have been spotted there as well.

The tipping point was the introduction of their own SourceForge Installer software, which bundles third-party offers with Free Software packages. We do not want to support this kind of behavior, and have thus decided to abandon SourceForge.

From now on, Jernej Simončič, who provides the installer packages, uploads them to our FTP directly, and from there they will be distributed automatically to our mirrors. Please check Downloads page for updated information. http://gimp-win.sourceforge.net will remain active for the time being and direct users to the new download locations.

This saddens me a lot. I’ve been a SourceForge user for over 13 years. Having my first project there with a “.sourceforge.net” URL was a moment of pride; I felt I was becoming a free software developer “for real”, seeing my code there along with all those other projects I relied on daily, such as GIMP.

htop, in particular, has lived a beautiful life in SourceForge:

It blows my mind to think that these are direct downloads of the source code only. I assume the vast majority of users install htop running the distro package manager (apt-get, yum, etc.) but I have absolutely no way to estimate how many times this program has been installed. And that’s not only okay, it’s beautiful: it’s the nature of free software, the actual freedom, at work. There is no tight grip on users from a central authority, the code is roaming free in what is essentially a network of solidarity. I’ve certainly benefited from this network much more than I’ve contributed to it, but I’m happy to give a small part.

Another achievement I’m very proud of: a perfect 5-star rating score for htop. Surely not that many reviews when compared to really big projects, but it’s noteworthy to me at least, and I’m thankful to everyone who rated.

So, thanks for everything SourceForge, but it looks like it’s time to move on.

Say hello to Github, htop

I’ve already got a bunch of projects in Github so this change should not be traumatic.

I’m converting the repositories to Github and moving the website to my own domain. I’ve started with dit, which is a low-profile project, and everything went smoothly. So it’s time to do the same with htop.

The process is straightforward:

  1. Here are the Instructions for importing from Subversion from Github
  2. They recommend using svn2git. It took me a few tries to get the conversion perfect. I recommend taking a look at the generated repository with gitk and creating a file called ~/.svn2git/authors containing aliases matching your svn usernames with Github equivalents.

    My authors file looks like this:

    loderunner = Hisham Muhammad <hisham@gobolinux.org>
    hisham = Hisham Muhammad <hisham@gobolinux.org>
    

    And the command-line for svn2git:

    svn2git svn://svn.code.sf.net/p/htop/code/
    
  3. The next step was to create a new Github repository. I did that through their web interface.
  4. And then, to push the code from the locally generated git repo to Github:
    git remote add origin https://github.com/hishamhm/htop.git
    git push -u origin master
    
  5. Next, I wanted to download the htop website from SourceForge. I usually edited things straight through their shell account, but this got me to an SFTP session in which I could fetch the files:
    sftp loderunner,htop@frs.sf.net

    (By the way, I actually use yafc, which I highly recommend, instead of sftp. Also, as I looked for its URL to post on this paragraph, I just learned that the original yafc 1.1.1 from 2001, which I still use and was hosted in SourceForge and sat there for years without updates, has been taken by a new group of developers who resurrected the project and develop it in Github.)

    The site for htop is really simple so moving it to a new location was a relatively quick process; just needed to grep all references to sf.net and svn, and get my own direct Paypal donation button (to replicate the one I had in SourceForge I used this and this).

  6. Getting the archive of releases was a bit trickier. It’s also in frs.sf.net, but you need to know the correct directory. It doesn’t show up when you log in and ls. You have to cd to it:
    yafc ftp://loderunner@frs.sf.net
    # then, in the sftp session:
    cd /home/frs/project/htop
    get -p -r htop
    

    Now I have an archive of all releases, with their (mostly accurate) historical timestamps!

  7. The final steps are shutting down the sf.net services, such as the bug tracker and the code repository. This can be done easily through their admin interface. I’ve never had any plans to migrate the bug tracker history; this simplifies things (and I don’t even know if it’s possible). I’m not shutting everything down immediately, though: until I make a proper new release and distro maintainers catch up with the new URLs, I’ll keep the download links active there.

    The original website, however, is now gone and replaced with the following:

    <html>
    <head>
    <meta http-equiv="refresh" content="0; url=http://hisham.hm/htop">
    </head>
    </html>
    

    (I did try an .htaccess file, but the sf.net servers kept redirecting to a default page even when I tried to do it via html… thanks to Lynx I realized it was redirecting with a 403 Forbidden error, so I removed the .htaccess file and things worked as expected.)

  8. Still TODO: migrate the htop-general mailing list.

One criticism I’ve had of Github in the past was that it did not promote a culture of making proper releases like SourceForge always did, but that has improved in recent years. Github now has a “releases” feature which, while not perfect, does the job in many cases. I’m not sure if I’ll use it, since I prefer to make the tarballs for htop using the make dist feature from GNU Autotools. I hope to cause as little disruption as possible to the distro maintainers and I want to keep my packages looking the same.

The big test for the new setup will be the next release, which I hope to make in time for htop’s tenth anniversary(!), in the coming months. It is quite fitting that htop presents its new home in such a special occasion!


Merging a Git feature branch that was already published on Github

Today I had a Git dillema, which I posted about on Twitter:

git help! I’ve got two local branches (X and master) and their remote counterparts in github. What’s the right way to merge X into master?

Here’s the situation in more detail: I had a “feature branch” for the new luarocks doc command, and I published it on Github to get some feedback. So, at that point I had four branches: in my machine I had master and luarocks-doc. But I also had master and luarocks-doc in github as well. I’ve been pushing master into origin/master and luarocks-doc into origin/luarocks-doc… I thought about doing a straightforward local merge but history could get messy and people would complain I didn’t do it right.

Alexander Gladysh came to the rescue, with the following sequence, which he listed to me through IM and worked flawlessly:

git fetch
git checkout master
git merge --ff-only origin/master
git checkout luarocks-doc
git merge --ff-only origin/luarocks-doc
git rebase master
git checkout master
git merge --ff-only luarocks-doc
git push origin master

As he said, “then you want to either update or delete luarocks-doc in the origin”:

To update:

git push -f origin luarocks-doc

To delete:

git push origin :luarocks-doc

The –ff-only flag tells merge not to perform incomplete merges (as it happened to me so often!) so it only runs it if the command would perform its task completely.

Alexander also gave me some links to material on Git workflow models, which I’ll be sure to read:

Hope that helps you as Alexander helped me. Sharing it forward!


How to write Lua modules in a post-module() world

Our beloved function module() is really going away. As of Lua 5.2 it’s only available with compatibility flags on, and the writing’s on the wall: it is going away for good in Lua 5.3. So, in a new Lua project I wrote this past semester, I decided to write it without using module(), while making sure my code runs on both Lua 5.1 and 5.2 (as a side result, I started the compat52 project, which allows you to write code in a well-behaved Lua 5.2 style and make it run on both 5.1 and 5.2).

So, why I liked module() in the first place? While vilified by some, I think the pros of module() largely trumped its cons. It had indeed some nice properties:

So, how to try to retain some of these properties without module()? The solution I found was to adopt some bits of policy, which I list below.

Yes, bits of policy. I know many in the Lua world hate policies, but of course I’m not putting a gun against anyone’s head to follow them. I’m only sharing what works for me and hopefully you may find some use. And don’t worry it’s nothing too esoteric, and it’s mostly cherry-picking some established practices.

Starting from the outside in

Keeping in mind that the goal of a module is to be required by client code, this is how a module foo.bar will be used:

local bar = require("foo.bar") -- requiring the module

bar.say("hello") -- using the module

An interesting observation here is that although we have a hierarchical structure of modules, the practice of loading them into locals means that in use they have to be accomodated in a flat namespace. So here’s Policy Bit #1:

Policy Bit #1: always require a module into a local named after the last component of the module’s full name.

Don’t do stuff such as local skt = require("socket") — code is much harder to read if we have to keep going back to the top to check how you chose to call a module.

Naming modules

Now that you know that your module will end up in people’s locals, please take that into consideration when naming your module. (I wish we had a capitalization policy to separate that nicely, but naming things LikeThis in Lua tends to be used only for object-oriented code.)

The idea is to choose a name that strikes a balance between convenience and uniqueness, and that is usable. And what better way to achieve this other than using this name. So, here’s Policy Bit #2, let’s use the module name in its declaration!

Policy Bit #2: start a module by declaring its table using the same all-lowercase local name that will be used to require it.

So, in the beginning of module foo.bar (which will live in foo/bar.lua), we begin with:

local bar = {}

It’s not a nice self-documenting header as we used to have with module("foo.bar", package.seall), but it’s something. We can improve that with LDoc comments:

--- @module foo.bar
local bar = {}

Don’t name your module something like “size”.

Declaring functions

When I’m scrolling through source code, I like to be able to tell what’s the sphere of influence of the piece of code I’m looking at. Is this a tiny helper function that’s only used below this line in this file? Is it an important function that’s used by other clients, so that an added or removed argument would mean API breakage? Ideally I like to be able to tell that without running back and forth in the code, so I really like visibility to be explicit in the syntax.

We must not declare global functions (or globals of any type, really!) in our modules, so using “globals vs. locals” to tell the difference won’t cut it. We have some alternatives, though. But first, let’s assert one thing:

Policy Bit #3: Use local function to declare local functions only: that is, functions that won’t be accessible from outside the module.

That is, local function helper_foo() means that helper_foo is really local.

This sounds obvious, but there are advocates of declaring all functions, public and private, as local functions and then writing an “export list” at the bottom of the module. Reading code written like this feels to me like a thriller with a twist ending: “haha, I was a public function all along!”

How to write public functions then? We must not declare global functions, but there are alternatives. Say we’re writing a function that will be used in client code as bar.say("hello"). It’s nice that we can declare it just like that:

function bar.say(greeting)
   print(greeting)
end

Policy Bit #4: public functions are declared in the module table, with dot syntax.

Visibility is made explicit through syntax. This is the same idea advocated by those who tell you to name your module tables “M”, except that you’re eating your own dogfood and using the name you expect your users to use. It’s also more consistent, since calls of say() are written bar.say() everywhere, instead of say(), M.say(), etc. (Also, “M.” looks really really ugly and people can’t decide if they want to use “M” or “_M”.)

In case you have speed concerns about having your calls go through the module table: first, this is what your users will go through; second, this is no different than using colon-syntax and dispatching through self and nobody complains about that; third, if you really need it (and have a benchmarked case for it), sure go ahead and make locals for optimization clearly marked as such; fourth, if you’re really after speed you’re probably using LuaJIT and last I heard the value of caching functions into locals is put into question there.

Classes and objects

When talking about classes and objects, it’s then time to talk about things named LikeThis. (If you don’t do OOP, feel free to skip this section!)

As we did above, let’s start look from the outside in: how to instantiate an object. There are two common practices (oh why I am not surprised :( )… you either make a class table with a “new” method, or make the “class object” callable (as a function or a table with a __call metamethod — wait, that makes it three practices…)

local myset1 = Set.new() -- style 1
local myset2 = Set() -- style 2.1 (set is a function)
local myset3 = Set() -- style 2.2 (set is a table)

If your module represents a class, I tend to like style 1 better because:

If all your module does is define a class, I guess it makes sense to name the module file MyClass.lua and have the class table be the module table. But I prefer not to do that, because often what we store as “static” class methods in purely OOP languages are really module functions. I still use the uppercase table when implementing the class, like this:

--- @module myproject.myclass
local myclass = {}

-- class table
local MyClass = {}

function MyClass:some_method()
   -- code
end

function MyClass:another_one()
   self:some_method()
   -- more code
end

function myclass.new()
   local self = {}
   setmetatable(self, { __index = MyClass })
   return self
end

return myclass

It’s easy to see in the code above that the functions with MyClass in their signature are methods. Sometimes it’s nice to declare the functions as fields inside the table declaration, but declaring methods separately as in the example above allows you to keep local helper functions closer to where they’re used.

If all the module does is declare the class, the class and module table may be one and the same. If you want to use style 2, we get something like this:

--- @module myproject.MyClass
local MyClass = {}

function MyClass:some_method()
   -- code
end

function MyClass:another_one()
   self:some_method()
   -- more code
end

local metatable = {
   __call = function()
      local self = {}
      setmetatable(self, { __index = MyClass })
      return self
   end
}
setmetatable(MyClass, metatable)

return MyClass

Both methods are acceptable, as long as it’s easy and obvious to tell you’re doing OOP:

Policy Bit #5: construct a table for your class and name it LikeThis so we know your table is a class.

Policy Bit #6: functions that are supposed to be used as object methods should be clearly marked as such, and the colon syntax is a great way to do it.

Don’t make people reading your function have to guess (or look up) if it is a method, a public module function or a local function.

Wrapping up

Return the module table. It’s a bit of boilerplate, but it’s what we have to deal with in a module()less world:

return bar

Policy Bit #7: do not set any globals in your module and always return a table in the end.

To sum it all up, a complete module foo.bar would look like this:

--- @module foo.bar
local bar = {}

local function happy_greet(greeting)
   print(greeting.."!!!! :-D")
end

function bar.say(greeting)
   happy_greet(greeting)
end

return bar

The result is that we type a bit more than we did with module(), and we risk polluting the global namespace if we’re not careful, but with this set of policies, we have:

…which mostly matches what I liked about module(), to the extent that can be done without _ENV tricks.

I’ve been using these policies successfully in a university project, and my plan is to follow them when I update the LuaRocks codebase to drop the use of module(). Consider your self encouraged to adopt some or hopefully all of them, but most importantly, whatever you do, be consistent! Good luck in this brave post-module() world!


Micro-retrospectiva 2013

Da minha humilde meta de ler 12 livros por ano, esse ano consegui 7. As notícias boas são que é o melhor resultado dos últimos 3 anos, e que li pelo menos um livro em cada uma das quatro línguas que eu tento ler. (Além disso, tenho conseguido manter a média de ao menos uma HQ por ano). Aqui a lista atualizada.

en_US Coding places: software practice in a South American city - Yuri Takhteyev
fr_FR L’Étranger - Albert Camus
es_AR Ficciones - Jorge Luis Borges
pt_BR O crepúsculo dos ídolos, ou Como filosofar com o martelo - Friedrich Nietzsche
pt_BR Notas do subsolo - Fiódor Dostoiévski
pt_BR Toda poesia - Paulo Leminski
en_US The mythical man-month - Frederick Brooks

Esse foi um ano em que eu praticamente não encostei em instrumentos musicais (dos 365 dias não devo ter encostado em instrumentos mais do que… 20 dias?), o que significa que eu devo estar tecnicamente pior em todos eles. Não passou em branco, porém. A produção concluída do ano foi uma música, “Little Crystal“:

Pra completar a parte cultural, em termos de viagens o ano foi fantástico. Uruguai, Rússia, França, Suíça, Portugal. Praticamente terminei de “pintar” a Europa ocidental no mapa, salvo parte da Escandinávia e alguns micro-países.

O ano acadêmico foi ótimo, ótimo: paper publicado no SBLP, qualificação do doutorado, palestra no RIT em Moscou, Lua Workshop, e cadeiras de Prog II e Linguagens Formais na PUC. Algumas dessas coisas eram planejadas, outras foram oportunidades que superaram todas as expectativas.

Feliz ano novo!!!