Thursday, August 8, 2019

My Nix FAQ (in progress)

WARNING: I'm not a Nix expert, so this may contain false or misleading statements.

It is assumed that you have used Nix, know what Nix can do and what the Nix expression language is, and understand basic concepts like "closure", "nix store" and "build hash".

Q: Why Nix and not OSTree/Flatpak/Docker/another edgy software deployment solution?
A: In addition to its advertised advantages, I believe that Nix preserves a valuable property of classical package managers which competitors seem to discard: structural clarity of the managed system. Every upstream project is represented by a separate nix function, bound to a separate attribute in the scope, producing separate paths in /nix/store when built and installed; dependencies are easy to analyze. Besides, Nix is flexible like Gentoo Portage: it's possible to write nix package expressions which allow altering a package and its set of dependencies, similarly to Portage USE flags.

Q: Where are all these nix expression files located in my Nix system, and how does Nix know about them?
A: Nixpkgs snapshots are in /nix/store; Nix finds them via the NIX_PATH env var and symlinks. Nix uses "path names" (<>-bracketed) which are translated to actual paths using NIX_PATH. E. g. <nixpkgs> is translated to /nix/var/nix/profiles/per-user/root/channels/nixos, which is a symlink to /nix/store/<current nixpkgs snapshot path>/nixos . Basically, Nix is configured to load <nixpkgs> first when you're using it via its package management utils.
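The NIX_PATH translation described above can be sketched in Python. This is a simplified model, not Nix's actual code: it assumes entries are separated by ':' and are either 'prefix=path' or a bare directory, and it ignores URL entries. The function name resolve_lookup_path and the injectable exists parameter are mine, made up for illustration.

```python
import os
import posixpath


def resolve_lookup_path(lookup, nix_path, exists=os.path.exists):
    """Resolve a <>-bracketed name such as 'nixpkgs/nixos' against NIX_PATH.

    Simplified model (an assumption): ':'-separated entries, each either
    'prefix=path' or a bare directory. URL entries are not handled.
    """
    for entry in nix_path.split(":"):
        if not entry:
            continue
        if "=" in entry:
            prefix, base = entry.split("=", 1)
            if lookup == prefix:
                candidate = base
            elif lookup.startswith(prefix + "/"):
                # <prefix/rest> maps to base/rest
                candidate = posixpath.join(base, lookup[len(prefix) + 1:])
            else:
                continue
        else:
            # A bare directory is searched for the whole lookup name
            candidate = posixpath.join(entry, lookup)
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Example value; check `echo $NIX_PATH` on your own system
    example = "nixpkgs=/nix/var/nix/profiles/per-user/root/channels/nixos"
    # exists is stubbed out so the demo runs on machines without Nix
    print(resolve_lookup_path("nixpkgs/nixos", example, exists=lambda p: True))
```

The first matching entry wins, which is why the order of NIX_PATH entries matters.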

Q: What is a package name in Nix? I see that there are directories with relevant names in the nixpkgs source tree, some nix tools use the "nixpkgs.${program_name}" form, others use "${program_name}-${program_version}"; how are all these related?
A: I think that one of the things which makes Nix hard to understand is that it has multiple entities which can be considered a "package name". They don't have to match each other, although policy requires that they do unless there's a good reason not to. Here I've tried to summarize them in a table.

| | Attribute name (of a derivation set*) | Derivation name (in a derivation set) | Package nix expression directory name |
| What it is | Name of attribute in nix scope which contains package's derivation set | String value of "name" attribute in package's derivation set | Name of directory which contains package nix expression |
| Where it comes from | Statement in some nix expression (usually pkgs/top-level/all-packages.nix) which binds derivation set produced by function from package nix expression to the attribute name | Statement in package nix expression which binds the value to "name" attribute in a set which is usually passed to the stdenv.mkDerivation | Path in nixpkgs source tree |
| How it can be used | Installing via "nix-env -iA" | Installing via "nix-env -i" (discouraged because search for derivation name in all derivation sets in nix scope is way more expensive than search for attribute name of derivation set) | - (nixpkgs source tree hierarchy exists for maintainers' convenience, paths are only used internally as callPackage arguments in statements which bind derivation sets to attribute names) |
| Where it can be seen | Searching via "nix search"; suggestions which Nix provides in the shell when user enters name of a program which doesn't exist in the system but is listed in binary cache index | Listing via "nix-env -q"; nix store paths (part following the buildhash) | nixpkgs git and /nix/store paths which contain nixpkgs snapshots; it seems that after nix expressions are loaded and namespace is populated, Nix doesn't remember anything about source file paths |

*A derivation set is a nix expression language set which produces a derivation when evaluated (nix considers a set to be a derivation set if it has a type="derivation" attribute)

Q: What does a typical package nix expression contain?
A: Usually there's something like {set1 keys}: let ... in stdenv.mkDerivation rec {set2 entries} inside; it's a function which takes set1, composes set2 and passes it to stdenv.mkDerivation, which produces a derivation set.
let ... in is used to name some entities which are used in the expression that follows
rec lets attributes of a set refer to each other ("resolves dependencies" in the set), e. g. it turns {x=1; y=x+1;} into {x=1; y=2;}
stdenv.mkDerivation is a "builder" function which produces a derivation set.
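To illustrate what rec does with {x=1; y=x+1;}, here is a toy Python model. It is only an analogy, not how Nix evaluates sets: each attribute is a function which can look up its siblings, and values are computed on demand. The helper name eval_rec_set is made up for this sketch.

```python
def eval_rec_set(attrs):
    """Evaluate a dict of name -> function(get) into plain values.

    Each function receives a lookup callable so it can refer to sibling
    attributes, loosely mimicking what `rec { ... }` allows in Nix.
    """
    cache = {}
    in_progress = set()

    def get(name):
        if name in cache:
            return cache[name]
        if name in in_progress:
            # An attribute that depends on itself can never be resolved
            raise RecursionError("infinite recursion at attribute %r" % name)
        in_progress.add(name)
        try:
            cache[name] = attrs[name](get)
        finally:
            in_progress.discard(name)
        return cache[name]

    return {name: get(name) for name in attrs}


# Toy counterpart of: rec { x = 1; y = x + 1; }
result = eval_rec_set({
    "x": lambda get: 1,
    "y": lambda get: get("x") + 1,
})
print(result)  # {'x': 1, 'y': 2}
```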

Q: What is a derivation? How are derivation sets, derivations, /nix/store/*.drv files and paths with package contents related?
A: Another thing which can make Nix hard to understand is how visibly this entity protrudes. If you think as a programmer about how Nix can turn a package nix expression into package contents, it may be obvious that there should be some intermediate representation with final values of dependency locations, etc. It's called a derivation, and .drv files are used to store derivations. A derivation contains build hashes of its inputs (i. e. locations of dependencies in /nix/store) and of its own outputs (i. e. locations which will be used to store package contents in /nix/store when the derivation is built). Evaluating a derivation set causes a derivation to be created and stored in a .drv file, along with the derivations of all its dependencies (the closure). After that the derivation can be built, causing package contents to be produced and stored. Here I've tried to summarize the Nix flow in a table:
| Nix entity | Location | derivative produced via | antiderivative found via |
| Package nix expression | nixpkgs source tree | loading | - |
| Derivation set (accessible via attribute name) | nix scope | evaluating | ? |
| Derivation | /nix/store | building | ? |
| Output path (package contents) | /nix/store | - | nix show-derivation ${path} |
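The idea that an output location is determined by the derivation's inputs can be shown with a toy model. To be clear, this is NOT Nix's real hashing scheme; it only shows the property that the same recipe with the same inputs gives the same path, while changing any dependency changes the path of everything built on top of it. The package names (libfoo, app) and the function toy_store_path are hypothetical.

```python
import hashlib
import json


def toy_store_path(name, builder_args, input_paths):
    """Return a fake /nix/store path whose hash depends on all inputs.

    NOT Nix's real algorithm: it only illustrates that the path is a pure
    function of the name, the build recipe and the input paths.
    """
    payload = json.dumps(
        {"name": name, "args": builder_args, "inputs": sorted(input_paths)},
        sort_keys=True,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:32]
    return "/nix/store/%s-%s" % (digest, name)


# Hypothetical packages: a library and an app which depends on it
libfoo = toy_store_path("libfoo-1.0", {"configureFlags": []}, [])
app_a = toy_store_path("app-2.3", {"configureFlags": ["--with-foo"]}, [libfoo])
app_b = toy_store_path("app-2.3", {"configureFlags": ["--with-foo"]}, [libfoo])

# Same app recipe, but built against a differently configured library
libfoo_debug = toy_store_path("libfoo-1.0", {"configureFlags": ["--debug"]}, [])
app_c = toy_store_path("app-2.3", {"configureFlags": ["--with-foo"]}, [libfoo_debug])

print(app_a == app_b)  # identical inputs give an identical path
print(app_a == app_c)  # a changed dependency gives a different path
```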

Q: I see a package nix expression directory in the nixpkgs source tree; how can I build and install the package?
A: "nix-env -i ${name}" if you can isolate the derivation name value in the source, but it's not guaranteed that exactly the one you're looking at will be built, because another package expression can declare the same ${name}; you can't even be sure that the expression you're looking at is bound to an attribute in nix scope at all. Nix is all about the scope: you should think of nixpkgs nix expressions as a single program, its source split into small files and structured into a tree hierarchy only for the programmer's convenience; browsing the source tree is not the intended way to find packages, and there seems to be no straightforward way to map a source file path to an attribute in the scope.

Q: Why are dependencies listed without version numbers in package nix expressions? Where are the familiar >=min_ver, <=max_ver?
A: Version ranges are used in traditional package management systems only because those systems allow only a single installed version of a package; with Nix it's possible to have any number of instances of any versions. If an app can run with the newest version of a lib, there's no reason to use an older version. Version-less attribute names in Nix always refer to the newest versions available in a nixpkgs snapshot; if an older version of a library is needed for some app, there will be a special ${name}_${version} attribute for it, which will be used in the deps of that app; if it gets installed, it will only be known to this specific app.

Q: Why do "nix search ${program_name}" results, in addition to "nixpkgs.${program_name}", contain attributes like "nixpkgs.${something_else}.${program_name}" with the same description? Why do Nix suggestions in the shell sometimes contain only the latter?
A: Nix scope is not flat, and some upper-level attributes can additionally appear nested in other sets which duplicate them for some internal purposes; "nix search" code seems to be imperfect and lists them as well as the top-level instances; the nix suggestions code is even more imperfect. However, it shouldn't matter which attribute name is used to build and install the package: if they all have an identical derivation set, any of them will produce the same derivation with the same output build hashes.

Q: What happens when I run "nix-env -iA nixpkgs.pkgname" ("nix-env --install --attr")?
A: nix loads <nixpkgs>, supposedly adding "pkgname" to the scope, builds a new "user-environment" derivation with "pkgname" added to its deps (causing the "pkgname" derivation to be built and its outputs stored in /nix/store), creates a new /nix/var/nix/profiles/per-user/<username>/profile-${next_available_number}-link symlink pointing to its output path and points the /nix/var/nix/profiles/per-user/<username>/profile symlink to that symlink.
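The symlink juggling at the end of this process can be sketched in Python on a scratch directory. This is a rough model under my own assumptions, not Nix's code: generation links are named '<profile>-N-link', and the main link is switched by creating a temporary link and renaming it over the old one. The function name switch_generation and the '.tmp' suffix are made up; the "env-one" and "env-two" directories stand in for user-environment store paths.

```python
import os
import re
import tempfile


def switch_generation(profiles_dir, profile_name, new_env_path):
    """Create the next '<profile>-N-link' symlink and repoint '<profile>' to it."""
    pattern = re.compile(r"^%s-(\d+)-link$" % re.escape(profile_name))
    numbers = [
        int(match.group(1))
        for match in (pattern.match(entry) for entry in os.listdir(profiles_dir))
        if match
    ]
    next_number = max(numbers, default=0) + 1
    generation = "%s-%d-link" % (profile_name, next_number)
    os.symlink(new_env_path, os.path.join(profiles_dir, generation))

    # Switch the main link by renaming a temporary link over it, so there
    # is no moment at which '<profile>' does not exist.
    tmp_link = os.path.join(profiles_dir, profile_name + ".tmp")
    os.symlink(generation, tmp_link)
    os.replace(tmp_link, os.path.join(profiles_dir, profile_name))
    return generation


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        profiles = os.path.join(root, "profiles")
        os.mkdir(profiles)
        for env in ("env-one", "env-two"):  # stand-ins for store paths
            env_path = os.path.join(root, env)
            os.mkdir(env_path)
            print(switch_generation(profiles, "profile", env_path))
        print(os.readlink(os.path.join(profiles, "profile")))
```

Old generation links are left in place, which is what makes rolling back to a previous generation possible.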

Q: What happens when I run "nixos-rebuild switch"?
A: nix loads <nixpkgs/nixos>, evaluates the "system" attribute and builds its derivation (which pulls all packages explicitly and implicitly specified in /etc/nixos/configuration.nix into the closure), creates a /nix/var/nix/profiles/system-${next_available_number}-link symlink pointing to its output path and points the /nix/var/nix/profiles/system symlink to that symlink; if a bootloader is configured, Nix updates its config, adding a new entry which references the kernel, initrd and init in /nix/store paths which belong to the new system profile closure.

Q: What happens when I run "nix-channel --update"?
A: Nix builds a new "user-environment" derivation which is the root node of the dep tree and pulls in nixpkgs snapshots for all configured channels as its deps; symlinks pointing to the new env are created in /nix/var/nix/profiles/per-user/<username>, so that <channel_name> resolves to the new snapshot path.

Confusion: a "user-environment" store path can be either a "channels" env (pulling in nixpkgs snapshots) or a "profile" env (pulling in installed pkgs).

Q: How are all those programs installed in /nix/store made available to the user without requiring the user to use paths with build hashes?
A: PATH contains /home/<username>/.nix-profile/bin (which is in "user-environment" derivation output path and contains collected symlinks to programs in bin/ subdirs of all deps) and /run/current-system/sw/bin (which is in "system" derivation output path, ...)

Thursday, March 15, 2018

SHOCKING SENSATION: AN UNFIXABLE VULNERABILITY IN EVERY PROCESSOR! DATA FROM ANY PROTECTED MEMORY REGION CAN BE OBTAINED IF YOU KEEP COMPUTING THE DIGITS OF PI FOR LONG ENOUGH!

Wednesday, March 16, 2016

Templates shouldn't be a mess

I was working on another Django project when I decided to write down my thoughts on readable template code formatting.

I believe that templates are more readable when they're written with the idea that a clean, correctly indented hypertext should be produced.
Django documentation states that there are template variables and template tags in the template language. In my formatting explanation, I call both 'tags' and divide them into text-producing and non-text-producing ones.
Django documentation samples and lots of coders mix up indentation of hypertext and template tags. I use independent parallel indentation for non-text-producing template tags which are placed on separate lines out of hypertext, and I separate such tags from surrounding hypertext with empty lines.

In detail:
  • hypertext is written with common, global indentation hierarchy
  • text-producing template tags (variable substitutions, includes, etc.) are placed like the hypertext which they produce:
    • tags which produce inline hypertext fragments are placed inside the hypertext lines
    • tags which produce (multi)line hypertext fragments are placed on separate lines, indented like the hypertext which they produce, following the global hypertext indentation
  • non-text-producing, or logic-controlling template tags, which generate no text themselves, but define the logic of template processing (conditional operators, loops, variable assignments, etc.), are another thing:
    • tags which control inline hypertext fragments (e. g. conditional block which controls a hypertext tag attribute) are placed within these lines
    • tags which control (multi)line hypertext fragments or no hypertext at all are placed on separate lines, following their own global indentation hierarchy, independent from the one of the hypertext; such template tag lines are separated by empty lines from the hypertext and hypertext-producing template tags around them
Example:

<html>
  <head>
  </head>
  <body>
    {% include 'header.htm' %}

{% if list %}

    <ul>

  {% for item in list %}

      <li>{{ item }}</li>

  {% endfor %}

    </ul>

{% endif %}

  </body>
</html>
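The "separated by empty lines" rule is mechanical enough to check with a script. Below is a rough Python sketch of such a check; the list of logic-controlling tags is my own incomplete assumption, and the function name check_template is made up. It reports standalone logic tag lines which directly touch a hypertext line.

```python
import re

# Logic-controlling tags considered by this sketch (an assumed, incomplete list).
LOGIC_TAG = re.compile(
    r"^\s*\{%\s*(if|elif|else|endif|for|empty|endfor|with|endwith)\b[^{}]*%\}\s*$"
)


def check_template(text):
    """Return 1-based numbers of standalone logic tag lines touching hypertext."""
    lines = text.splitlines()
    problems = []
    for i, line in enumerate(lines):
        if not LOGIC_TAG.match(line):
            continue
        for j in (i - 1, i + 1):
            if 0 <= j < len(lines):
                neighbour = lines[j]
                # A neighbour may be empty or another standalone logic tag
                if neighbour.strip() and not LOGIC_TAG.match(neighbour):
                    problems.append(i + 1)
                    break
    return problems


good = "<ul>\n\n{% for item in list %}\n\n  <li>{{ item }}</li>\n\n{% endfor %}\n\n</ul>\n"
bad = "<ul>\n{% for item in list %}\n  <li>{{ item }}</li>\n{% endfor %}\n</ul>\n"
print(check_template(good))  # []
print(check_template(bad))   # [2, 4]
```

Inline tags, such as a conditional inside a hypertext tag attribute, are deliberately ignored because the line does not consist of the tag alone.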

Sunday, June 21, 2015

Notes about installing Debian/Ubuntu from other Linux distro with debootstrap

Existing guides are either incomplete, inaccurate or assume that the base system is Debian or Ubuntu: https://help.ubuntu.com/community/Installation/FromLinux#Without_CD

If you want to set it up manually, you can download the latest .tar.gz package from http://ftp.debian.org/debian/pool/main/d/debootstrap/ and run debootstrap from the source tree with the DEBOOTSTRAP_DIR env variable set. Note that it requires devices.tar.gz to be present in the same directory; you can either build it by running
make devices.tar.gz
or extract a prebuilt one from the debootstrap .deb package. Besides, it may require an explicit arch and mirror URL on the command line, e. g.:
DEBOOTSTRAP_DIR=. ./debootstrap --arch=i386 vivid /mnt/installroot https://mirrors.kernel.org/ubuntu
Without the explicit URL it can attempt to get the Release file from https://mirrors.kernel.org/debian even if an Ubuntu release (e. g. vivid) is specified.
All external dependencies are usually present in any Linux system, with the possible exception of Perl, which I installed in my live Slax with no effort (you may also find the previous blogpost about booting Slax from the Windows partition without using removable media useful).
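If you'd rather drive this from a script, the debootstrap invocation above can be assembled like this. It's a minimal sketch: the helper name debootstrap_command is made up, it uses only the options shown in the command above, and nothing is executed unless you uncomment the last lines (which need root).

```python
import os
import shlex


def debootstrap_command(source_dir, suite, target, mirror, arch=None):
    """Build the environment and argv for running debootstrap from its source tree."""
    env = dict(os.environ, DEBOOTSTRAP_DIR=source_dir)
    argv = [os.path.join(source_dir, "debootstrap")]
    if arch:
        argv.append("--arch=" + arch)
    # Always pass the mirror explicitly, see the note about the default URL
    argv += [suite, target, mirror]
    return env, argv


env, argv = debootstrap_command(
    ".", "vivid", "/mnt/installroot", "https://mirrors.kernel.org/ubuntu", arch="i386"
)
print("DEBOOTSTRAP_DIR=%s %s" % (
    shlex.quote(env["DEBOOTSTRAP_DIR"]),
    " ".join(shlex.quote(arg) for arg in argv),
))

# To actually run it (as root, from the extracted source tree):
# import subprocess
# subprocess.check_call(argv, env=env)
```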

If everything has been set up properly, debootstrap should succeed in installing the base system and you can proceed with mounting, chrooting, basic configuration (remember to configure apt sources) and installing packages. You can find names of Ubuntu metapackages for installing the whole system at https://help.ubuntu.com/community/MetaPackages (although it omits several variants like lubuntu-core).

Saturday, June 20, 2015

If you REALLY need to boot Linux on a Windows machine and don't have any bootable media


You can boot Slax Linux from a Windows NTFS partition using GRUB4DOS, without even touching the Windows boot sector. This HOWTO assumes that the machine has a typical Windows 7 installation.

1. Download GRUB4DOS, extract grldr.mbr and grldr to your C:\

2. Use bcdedit commandline tool to add GRUB4DOS to bootmgr:
> BCDEDIT.EXE /store C:\boot\BCD /create /d "Start GRUB4DOS" /application bootsector
< {guid}
> BCDEDIT.EXE /store C:\boot\BCD /set {guid} device boot
> BCDEDIT.EXE /store C:\boot\BCD /set {guid} path \grldr.mbr
> BCDEDIT.EXE /store C:\boot\BCD /displayorder {guid} /addlast

See http://diddy.boot-land.net/grub4dos/files/install_windows.htm for more details.

3. Download Slax and extract slax directory to C:\

4. Create C:\menu.lst for GRUB4DOS menu and add menu item for booting Slax:
title slax
kernel /slax/boot/vmlinuz vga=normal load_ramdisk=1 prompt_ramdisk=0 rw printk.time=0 slax.flags=xmode,toram
initrd /slax/boot/initrfs.img

If you wonder where these boot parameters come from, Slax has a syslinux boot config, C:\slax\boot\syslinux.cfg, which defines a complex boot menu with toggleable options. There are several 'MENU BEGIN xxxxx' blocks, where each of the first 4 characters of the menu name is 0 or 1: the 1st one represents the state of the 'Persistent changes' option, the 2nd 'Graphical desktop', the 3rd 'Copy to RAM' and the 4th 'Act as PXE server'. The 5th character is 0...3 and means nothing but the number of the highlighted menu item. I've chosen a variant with graphical desktop and copy-to-ram, so I'm using boot parameters from the 'MENU BEGIN 01100' block:
KERNEL /slax/boot/vmlinuz
APPEND vga=normal initrd=/slax/boot/initrfs.img load_ramdisk=1 prompt_ramdisk=0 rw printk.time=0 slax.flags=xmode,toram
Note that the grub syntax differs from the syslinux one (inline kernel arguments, a separate initrd line instead of a kernel pseudo-argument).
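The conversion from the syslinux block to the GRUB4DOS menu entry is mechanical, so here is a small Python sketch of it, together with a decoder for the menu names. Both helper names are made up; the sketch handles only the KERNEL/APPEND pair shown above and assumes a 5-character menu name.

```python
def syslinux_to_grub4dos(title, kernel_line, append_line):
    """Convert a syslinux KERNEL/APPEND pair into GRUB4DOS menu.lst lines."""
    kernel_path = kernel_line.split(None, 1)[1].strip()
    args = append_line.split()[1:]  # drop the leading 'APPEND'
    initrd = None
    kernel_args = []
    for arg in args:
        if arg.startswith("initrd="):
            # syslinux passes initrd as a pseudo-argument; grub wants its own line
            initrd = arg[len("initrd="):]
        else:
            kernel_args.append(arg)
    lines = ["title " + title, " ".join(["kernel", kernel_path] + kernel_args)]
    if initrd:
        lines.append("initrd " + initrd)
    return "\n".join(lines)


def decode_slax_menu_name(name):
    """Decode a 'MENU BEGIN xxxxx' name into option states and the highlighted item."""
    options = ["Persistent changes", "Graphical desktop", "Copy to RAM", "Act as PXE server"]
    flags = {option: name[i] == "1" for i, option in enumerate(options)}
    return flags, int(name[4])


entry = syslinux_to_grub4dos(
    "slax",
    "KERNEL /slax/boot/vmlinuz",
    "APPEND vga=normal initrd=/slax/boot/initrfs.img load_ramdisk=1 "
    "prompt_ramdisk=0 rw printk.time=0 slax.flags=xmode,toram",
)
print(entry)
print(decode_slax_menu_name("01100"))
```

The printed entry should match the menu.lst item from step 4.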

Now you can choose 'Start GRUB4DOS' in Windows boot menu and 'slax' in GRUB4DOS menu to boot Slax.

Sunday, June 7, 2015

Using Selenium+PhantomJS+Browsermob-proxy for AJAX scraping

Recently I needed to write a Python script which obtains data from the AJAX traffic of a website with very good anti-robot protection. Selenium with some browser (I prefer headless PhantomJS) is usually good for such tasks, but in this case it was not enough: I needed raw AJAX data, not webpage contents after JS processing, and the server did not accept direct requests, even with cookies which I'd obtained with Selenium. I tried to get a traffic dump from the browser; PhantomJS can generate a HAR dump, but it turned out that it still lacks support for capturing contents. The next idea was to use a capturing proxy; Browsermob-proxy came up as a good choice. It supports HAR, too, and can be easily controlled from a Python script with the browsermobproxy module, just like a browser with Selenium.
 
Here is the code example:

# Start proxy
from browsermobproxy import Server
server = Server('/path/to/browsermob-proxy')
server.start()
proxy = server.create_proxy()

# Start browser
import selenium.webdriver
browser = selenium.webdriver.PhantomJS('/path/to/phantomjs', service_args=['--proxy={0}'.format(proxy.proxy), '--ignore-ssl-errors=true'])

# Tell browser to open webapp page
browser.get('http://web.app/page/url')

# Tell proxy to start capture
proxy.new_har(options={'captureHeaders':True, 'captureContent':True})

# Tell browser to perform some actions which should cause AJAX requests
browser.find_element_by_id('some-input-id').send_keys('some input')

# Wait for end of transmission
# Ideally, this should be implemented as waiting for some browser event
from time import sleep
sleep(10)

# Process results
for entry in proxy.har['log']['entries']:
    if entry['request']['url'] == 'http://web.app/ajax/endpoint/url':
        print(entry['response']['content']['text'])

# Shutdown
browser.quit()
server.stop()
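Instead of the fixed sleep(10), one option is to poll the HAR until the expected entry shows up. Here is a minimal sketch of that idea; the helper names find_response_text and wait_for_response are made up, and the only thing assumed about the proxy is that reading proxy.har returns the HAR dict, as in the example above. It is still polling, not a real browser event.

```python
import time


def find_response_text(har, url):
    """Return the captured response body for `url` from a HAR dict, or None."""
    for entry in har["log"]["entries"]:
        if entry["request"]["url"] == url:
            return entry["response"]["content"].get("text")
    return None


def wait_for_response(get_har, url, timeout=10.0, interval=0.5):
    """Poll get_har() until a response body for `url` is captured.

    Raises TimeoutError if nothing is captured within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        text = find_response_text(get_har(), url)
        if text is not None:
            return text
        if time.monotonic() >= deadline:
            raise TimeoutError("no response captured for %s" % url)
        time.sleep(interval)


# Usage with the objects from the example above (not executed here):
# text = wait_for_response(lambda: proxy.har, 'http://web.app/ajax/endpoint/url')
# print(text)
```

Passing a callable rather than the proxy itself keeps the helper easy to try out with a fake HAR source.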