Finding the size of directories in the shell

I recently wanted to know the size of each folder in my current working directory from the terminal. I started by piping the results of find into xargs and then getting the sizes with du. Unfortunately, keeping track of each directory's name through the pipeline, formatting the sizes so they looked good, and sorting the directories by size in descending order turned out to be a pain. Usually, around the time I start reading the man pages for awk, it’s time to ditch bash and use a real programming language. 😃

This is what I came up with. It’s a PHP script that runs in the terminal and accepts three options: whether to sort by size or by name, the sort order, and the number of digits of precision used to format the sizes. It then prints a neatly aligned report with human-readable file sizes. I keep it in my path so I can call it with dir_sizes. So far it’s working really well, and it lets me do in the shell what I would otherwise have used Disk Usage Analyzer or a similar GUI tool for. It starts every du process with popen before reading any of their output, so the processes run in parallel, which I found to be roughly 3x faster than running each check sequentially on my quad-core laptop.

By the way, if you ever find yourself wishing that C++ had a REPL and go to compile Cling, an interactive C++ interpreter, keep in mind that it will generate 30+ GB worth of files. 😲

 

#! /usr/bin/php
<?php

function format_bytes($size, $precision){
    $units = ['B', 'KB', 'MB', 'GB', 'TB'];
    $units_len = count($units) - 1;

    for ($i = 0; $size >= 1024 && $i < $units_len; ++$i)
        $size /= 1024;

    return sprintf("%.{$precision}f %s", $size, $units[$i]);
}

function parse_args($args, $argc, $argv){
    $parsed = [];

    if($argc < 3)
        return array_reduce($args, function($args, $arg){
            $args[$arg['long']] = $arg['default'];
            return $args;
        }, []);

    foreach ($args as $arg)
        for ($i = 0; $i < $argc; ++$i) {
            if (
                ($argv[$i] === '-' . $arg['short'] || $argv[$i] === '--' . $arg['long']) &&
                // isset() rather than !empty(): empty("0") is true, which would reject "-p 0"
                isset($argv[$i + 1]) &&
                $arg['valid']($argv[$i + 1])
            ){
                $parsed[$arg['long']] = $argv[$i + 1];
                break;
            }
            else
                $parsed[$arg['long']] = $arg['default'];
        }

    return $parsed;
}

$args = parse_args([
    [
        'short' => 's',
        'long' => 'sort-by',
        'valid' => function($arg){
            return in_array($arg, [
                'size',
                'name'
            ]);
        },
        'default' => 'size'
    ],
    [
        'short' => 'p',
        'long' => 'precision',
        'valid' => function($arg){
            // Accept only non-negative whole numbers; intval("abc") is 0 and would pass
            return ctype_digit($arg);
        },
        'default' => 2
    ],
    [
        'short' => 'o',
        'long' => 'order',
        'valid' => function($arg){
            return in_array($arg, [
                'asc',
                'desc'
            ]);
        },
        'default' => 'desc'
    ]
], $argc, $argv);

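// Quote the working directory so paths containing spaces or shell metacharacters are safe
exec("find " . escapeshellarg(getcwd()) . " -mindepth 1 -maxdepth 1 -type d", $dirs);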

$longest_dir = $longest_size = 0;

$dir_sizes = array_map(
    function ($dir, $size_info) use (&$longest_dir, &$longest_size, $args) {
        // du separates the byte count from the path with a tab
        $size = intval(explode("\t", $size_info)[0]);
        $formatted_size = format_bytes($size, $args['precision']);

        $dir_len = strlen($dir);
        $size_len = strlen($formatted_size);

        if ($dir_len > $longest_dir)
            $longest_dir = $dir_len;

        if ($size_len > $longest_size)
            $longest_size = $size_len;

        return [
            'name' => $dir,
            'size' => $size,
            'formatted_size' => $formatted_size
        ];
    },
    $dirs,
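    array_map(function ($stream) {
        // Read the whole output, then close the pipe so the du process is reaped
        $output = stream_get_contents($stream);
        pclose($stream);
        return $output;
    }, array_map(function ($dir) {
            // Every du process is started here, before any output is read, so they run in parallel
            return popen("du -sb " . escapeshellarg($dir), 'r');
        }, $dirs))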
);

usort($dir_sizes, function ($a, $b) use ($args) {
    if ($a === $b)
        return 0;

    $asc = $args['order'] === 'asc';

    if($a[$args['sort-by']] < $b[$args['sort-by']])
        return $asc ? -1 : 1;
    else
        return $asc ? 1 : -1;
});

foreach ($dir_sizes as $dir)
    printf("%-{$longest_dir}s  %{$longest_size}s\n", $dir['name'], $dir['formatted_size']);

Outputs:

ryan@ryan-XPS-13-9343:~/dev/elixir-blog$ dir_sizes 
/home/ryan/dev/elixir-blog/react-app     98.77 MB
/home/ryan/dev/elixir-blog/phoenix_app    9.68 MB
/home/ryan/dev/elixir-blog/.git         521.87 KB
/home/ryan/dev/elixir-blog/.idea         39.32 KB
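
For comparison, here is a minimal sketch of the same start-everything-then-read idea in Node.js. It is not part of the PHP script above: the names parseDuLine and dirSizes are made up for illustration, and it assumes a GNU du that supports -b, just like the script does.

```javascript
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");

const execFileAsync = promisify(execFile);

// Parse one line of `du -sb` output, which looks like "<bytes>\t<path>".
// Returns 0 when the output is empty or does not start with a number.
function parseDuLine(line) {
  const [bytes] = line.split("\t");
  return Number.parseInt(bytes, 10) || 0;
}

// Spawn one `du` per directory immediately, then wait for all of them.
// Each async callback runs up to its first await, so every process is
// started before any result is collected.
async function dirSizes(dirs) {
  const pending = dirs.map(async (dir) => {
    const { stdout } = await execFileAsync("du", ["-sb", dir]);
    return { name: dir, size: parseDuLine(stdout) };
  });
  return Promise.all(pending);
}
```

Calling dirSizes with a list of directory paths resolves to an array of { name, size } objects that can then be sorted and formatted the same way the PHP version does. Using execFile instead of a shell string avoids having to escape the directory names.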

 

Partition an Array in JavaScript

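At times it’s necessary to partition an array into equal-size chunks, where the last chunk may be shorter if the length doesn’t divide evenly. This simple JS one-liner makes it really easy to do. Here a is the array and size is the chunk size.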

 

Array(Math.ceil(a.length / size)).fill(0).map((_,i) => a.slice(i * size, (i+1) * size));

 

Example Usage:

 

const a = [1,2,3,4,5,6,7,8,9];

function partition(a,size){
    return Array(Math.ceil(a.length / size)).fill(0).map((_,i) => a.slice(i * size, (i+1) * size));
}

Array(9).fill(0).forEach((_,i) => console.log(partition(a,i+1)));

/*
[ [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ], [ 8 ], [ 9 ] ]
[ [ 1, 2 ], [ 3, 4 ], [ 5, 6 ], [ 7, 8 ], [ 9 ] ]
[ [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7, 8, 9 ] ]
[ [ 1, 2, 3, 4 ], [ 5, 6, 7, 8 ], [ 9 ] ]
[ [ 1, 2, 3, 4, 5 ], [ 6, 7, 8, 9 ] ]
[ [ 1, 2, 3, 4, 5, 6 ], [ 7, 8, 9 ] ]
[ [ 1, 2, 3, 4, 5, 6, 7 ], [ 8, 9 ] ]
[ [ 1, 2, 3, 4, 5, 6, 7, 8 ], [ 9 ] ]
[ [ 1, 2, 3, 4, 5, 6, 7, 8, 9 ] ]

*/
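
One thing to watch for with the one-liner: a size of 0 makes the length Infinity, and Array(Infinity) throws a RangeError with a fairly unhelpful message. Below is a small sketch of a guarded variant. The name partitionChecked is only for illustration, and it swaps Array(n).fill(0).map(...) for Array.from, which builds the chunks in one step.

```javascript
// Split `items` into consecutive chunks of at most `size` elements.
// Throws early with a clear message when `size` is not a positive integer.
function partitionChecked(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  return Array.from({ length: Math.ceil(items.length / size) }, (_, i) =>
    items.slice(i * size, (i + 1) * size)
  );
}

console.log(partitionChecked([1, 2, 3, 4, 5, 6, 7], 3));
// [ [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7 ] ]
```

An empty input array simply yields an empty result, since the computed length is 0.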