• tal@lemmy.today
    English
    arrow-up
    19
    ·
    edit-2
    15 hours ago

    Assuming that the leading number and period are part of the filename:

    1. If the directory-browsing thing supports natural sorting of numbers, you can use that; this will detect numbers within the filename and sort by them. For example, I use emacs’s dired to browse directories and play movies. It can modify the flags passed to ls, and ls supports natural sorting of numbers in filenames with the -v flag. Doing this in dired means C-u s v RET, and the directory will be displayed sorted numerically.

    2. If the directory-browsing thing doesn’t support that, rename to a “0”-padded format such that a lexicographic order has the same ordering as numeric order:

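       $ for f in [0-9]*.*; do
           num="${f%%.*}"                     # leading number, e.g. "1"
           # bash: 10# forces base 10, so "08" is not rejected as bad octal
           n="$(printf %04d "$((10#$num))")"
           mv -- "$f" "$n.${f#*.}"
       done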
      

      That way,

       1. Episode_One.mp4
      

      will become

       0001. Episode_One.mp4
      

      and so forth, and so software that can only do lexicographic orderings is happy.
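
    The same renaming rule can be sketched in Python as a pure function, which makes it easy to check the new names before touching any files. This is only a sketch: `padded_name` and its `width` parameter are names made up for illustration, and it assumes the number is the very first thing in the filename.

```python
import re

def padded_name(name: str, width: int = 4) -> str:
    """Zero-pad the run of digits at the start of `name` to `width` digits."""
    match = re.match(r"[0-9]+", name)
    if match is None:
        return name  # no leading number, leave the name alone
    number = int(match.group())  # int() reads "08" as decimal 8
    return f"{number:0{width}d}{name[match.end():]}"

names = ["1. Episode_One.mp4", "2. Episode_Two.mp4", "11. Episode_Eleven.mp4"]
# Once padded, a plain text sort gives the numeric order.
print(sorted(padded_name(n) for n in names))
```

    To actually rename, you could pass each old name and `padded_name(old)` to `os.rename`, after checking that no two files map to the same new name.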

    If the leading number and period aren’t part of the filename, then we need to parse human-language strings.

    $ npm install words-to-numbers
    
    $ for f in Episode_*.mp4; do
        w="$(echo "$f"|sed -r 's/^Episode_([^.]*)\.mp4$/\1/')"
        n=$(echo "$w"|node -e 'const wn=require("words-to-numbers"); const rl=require("readline").createInterface({input:process.stdin, output:process.stdout, terminal:false}); rl.on("line", line=> {console.log(wn.wordsToNumbers(line));})')
        nf=$(printf %04d "$n")
        mv "$f" "$nf. $f"
    done
    

    That’ll rename to the format mentioned above in option 2:

    Episode_One.mp4
    

    will become

    0001. Episode_One.mp4
    

    And lexicographic sorting will work.
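
    For a short series, the word-to-number step can also be sketched without any package. The following is an assumption-heavy illustration, not the words-to-numbers library: `WORDS` is a small hand-written table that only covers zero through twelve, and `numbered_name` is a hypothetical helper that expects names shaped exactly like `Episode_<Word>.mp4`.

```python
import re

# Hypothetical lookup table; the list index is the number. Extend as needed.
WORDS = ["zero", "one", "two", "three", "four", "five", "six",
         "seven", "eight", "nine", "ten", "eleven", "twelve"]

def numbered_name(name: str, width: int = 4) -> str:
    """Prefix 'Episode_<Word>.mp4' with its zero-padded number."""
    match = re.fullmatch(r"Episode_([A-Za-z]+)\.mp4", name)
    if match is None or match.group(1).lower() not in WORDS:
        return name  # unknown shape or unknown word, leave the name alone
    number = WORDS.index(match.group(1).lower())
    return f"{number:0{width}d}. {name}"

print(numbered_name("Episode_One.mp4"))
```

    Anything the table does not know is returned unchanged, so a dry run shows which files would still need handling by hand.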

    • hemko@lemmy.dbzer0.com
      English
      arrow-up
      20
      ·
      18 hours ago

      This is why I love Lemmy. There’s a meme post and then someone writes a 7-page-long technical solution like they were a paid contributor at Stack Overflow.

      Never change <3

      • tal@lemmy.today
        English
        arrow-up
        5
        ·
        edit-2
        17 hours ago

        The first case is the most-likely for most people, and the simplest to do. Most directory browsers do support numeric sorting.

        In the second case, I provided a Perl program in another comment in the thread that provides a generic way to do this with one command for nearly all files of this sort.

        The third case, where human-language stuff needs to be parsed, true enough, doesn’t just have a button to push.

  • jordanlund@lemmy.world
    arrow-up
    26
    ·
    19 hours ago

    It’s sorting correctly.

    11 comes after 1.

    The problem is the data needs to be fixed. 01, 02, 03…

    • logicbomb@lemmy.world
      arrow-up
      10
      ·
      18 hours ago

      It’s not sorting correctly. Right above the listing it specifically says “(sorting by #)”.

      If it says it is sorting by number, but then it sorts alphabetically, then it isn’t sorting correctly.

      • jordanlund@lemmy.world
        arrow-up
        5
        ·
        16 hours ago

        Problem is it can’t sort by number if there are text values in the file name. “11” is a number. “11.” is not. :) Fun with computers!

        • logicbomb@lemmy.world
          arrow-up
          9
          ·
          14 hours ago

          Both “11” and “11.” are strings, because the context is listing filenames. Filenames are not numbers. They are strings. If you sort filenames by number, you are asking the computer to interpret the string as having a number inside it. At worst, it might interpret “11” as an integer and “11.” as a floating point number, because that syntax is often used to specify a floating point number in programming. But even then, it could still sort them correctly.

          I don’t mean to start an argument, but as a professional programmer, there are just some things that I know.

          • Fushuan [he/him]@lemm.ee
            arrow-up
            4
            ·
            14 hours ago

            Are you intentionally ignoring that the actual names of the files are “11. EpisodeEleven.mp3”? There’s whitespace and a bunch of letters in there.

            I’m also a professional programmer, and assuming that sort-by-number code would grab the leading block of characters up to the first whitespace is a big assumption I would not make. I’d guess that it tried to convert everything but the extension to a number, failed, and fell back to string sorting for everything.
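
            The fallback guessed at above can be sketched like this. It only illustrates that hypothesis and says nothing about what the actual software does; `sort_by_number_or_text` is a made-up name.

```python
from pathlib import PurePath

def sort_by_number_or_text(names):
    """Sort numerically only if every stem is a whole number, else as text."""
    try:
        # The stem is the filename without its extension, e.g. "11" for "11.mp3".
        return sorted(names, key=lambda n: int(PurePath(n).stem))
    except ValueError:
        # At least one stem is not a plain integer: fall back to text order.
        return sorted(names)

print(sort_by_number_or_text(["11.mp3", "2.mp3", "1.mp3"]))
print(sort_by_number_or_text(["2. Two.mp3", "11. Eleven.mp3", "1. One.mp3"]))
```

            With bare numbers as names the order is 1, 2, 11; as soon as one name has extra text, everything drops back to the text order that puts 11 before 2.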

            • logicbomb@lemmy.world
              arrow-up
              1
              ·
              6 hours ago

              You explaining that you would only do a numerical sort when the basename of the filename is entirely a number, as if your logical sense translates into a good user experience, is exactly why companies have to hire UX designers instead of just programmers.

              If you have Windows 10, I suggest that you play around with filenames with numbers in file explorer and see how they’re sorted. Your intuition does not match the average user’s expectations in this circumstance.

              • Fushuan [he/him]@lemm.ee
                arrow-up
                1
                ·
                6 hours ago

                I know it doesn’t. I was countering your logic of “I’m a professional programmer”, as if the correct interaction would be obvious to a programmer. The intended interaction requires extra thought and being more thorough than the obvious one, as you have described.

    • Wren@lemmy.world
      arrow-up
      6
      ·
      edit-2
      19 hours ago

      Fair point, but how does it reconcile that 11 and 12 comes before 2?

      (Admittedly, I know nothing at all about this stuff. I’m a music major and I went all in on that, so I’m genuinely curious)

        • Wren@lemmy.world
          arrow-up
          5
          ·
          19 hours ago

          So that’s the joke then? That someone chose to alphabetically sort numbers?

            • Wren@lemmy.world
              arrow-up
              1
              ·
              edit-2
              6 hours ago

              Me? Not at all. Unless you meant the comic strip thing, then yeah. I suppose I could see that.

              Being bad at a lot of things doesn’t bother me a bit. I’m good at other things. And some of those things, I’m even great at!

              • Klear@sh.itjust.works
                arrow-up
                1
                ·
                2 hours ago

                No, I did not mean you.

                I found it especially overblown since I ran into a very similar issue recently. I actually laughed at myself when I saw the mangled list and fixed the names (it helped that the list was something I generated with a script and adding leading zeroes was a simple matter).

          • knightly the Sneptaur@pawb.social
            arrow-up
            5
            ·
            edit-2
            16 hours ago

            Basic sorting is always like this; the joke is that way too many people still number things badly. Alphanumerically sorting variable-length numbers without normalizing the number of digits will always result in situations like 02 < 1 < 109 < 11 < 2
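
            That ordering is easy to reproduce. A minimal sketch, using Python’s built-in `sorted`, which compares strings character by character:

```python
names = ["2", "11", "1", "109", "02"]

as_text = sorted(names)              # compares character by character
as_numbers = sorted(names, key=int)  # compares the numeric values

print(as_text)     # ['02', '1', '109', '11', '2']
print(as_numbers)  # ['1', '2', '02', '11', '109']
```

            In the numeric version “2” and “02” compare equal, so they simply keep their original relative order.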

            • Wren@lemmy.world
              arrow-up
              4
              ·
              18 hours ago

              I’m assuming that normalizing digits is similar to normalizing audio? Where you take the numbers and assign them all a blanket set of instructions?

              Normalization in mixing audio essentially takes a track and adjusts the volume of the track’s peaks to keep them from clipping the volume threshold.

              Is it like this?

              • knightly the Sneptaur@pawb.social
                arrow-up
                3
                ·
                edit-2
                16 hours ago

                Yeah! In general, Normalization refers to adjusting values measured on different scales to a common scale.

                Consider 1 and 10. The first digit of both numbers is 1, so a sort that compares character by character, ignoring how many digits each number has, puts both of them before 2.

                Normalizing both numbers to a two digit scale gives us 01 and 10, which sort as expected with 02.
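
                A minimal sketch of that normalization, using Python’s `str.zfill` to pad each value to two digits:

```python
values = ["1", "10", "2"]
padded = [v.zfill(2) for v in values]  # left-pad with zeros to two digits

print(sorted(values))  # ['1', '10', '2']
print(sorted(padded))  # ['01', '02', '10']
```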

                • Wren@lemmy.world
                  arrow-up
                  2
                  ·
                  18 hours ago

                  Ahhh. So you have automated processes that will handle it as well. Just plug in the ranges and it does its thing.

          • Buffalox@lemmy.world
            arrow-up
            5
            ·
            edit-2
            18 hours ago

            Not just someone. It’s the default: when numbers are in text strings, they are treated as text, not as numerical values.
            Accounting for numbers inside text strings in any text listing system requires extra work when programming it.
            So the joke is that computers are pretty dumb in this regard, and they need a lot of help to do it right.
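
            As a rough idea of what that extra work looks like, here is a sketch of a “natural” sort key. `natural_key` is a made-up name, and real file managers handle many more cases (locale, signs, decimals) than this does.

```python
import re

def natural_key(name: str):
    """Split a name into text and digit chunks so digit runs compare as integers."""
    # With a capture group, re.split alternates: text, digits, text, digits, ...
    parts = re.split(r"([0-9]+)", name)
    # Odd positions are always digit runs, even positions are always text,
    # so two keys never compare an int against a str.
    return [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]

names = ["11. Eleven.mp4", "2. Two.mp4", "1. One.mp4"]
print(sorted(names, key=natural_key))
```

            The key leaves the filenames untouched; only the comparison changes, which is the difference from renaming everything with leading zeros.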

      • jordanlund@lemmy.world
        arrow-up
        4
        ·
        18 hours ago

        Filenames are text, not numbers, so 11 comes before 2. So does 111, 101, etc. Anything starting with 1 is before 2.

        In order to sort “properly”, you have to name your files properly.

        00
        01
        02
        up to…
        99

        If you have more than 100 files in the same folder, then you have to go back through and rename them:

        000
        001
        002
        … 099
        100
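
        How many digits are needed follows directly from the file count. A small sketch, assuming numbering starts at 0 as in the lists above; `pad_width` and `numbered` are names invented for the example.

```python
def pad_width(count: int) -> int:
    """Digits needed to number `count` files starting from 0."""
    return len(str(max(count - 1, 0)))

def numbered(count: int) -> list[str]:
    """Zero-padded labels 0 .. count-1, all with the same width."""
    width = pad_width(count)
    return [f"{i:0{width}d}" for i in range(count)]

print(pad_width(100))    # 2, labels run 00 to 99
print(pad_width(101))    # 3, labels run 000 to 100
print(numbered(101)[:3])
```

        Picking the width up front from the expected final count avoids the second round of renaming.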

          • Buffalox@lemmy.world
            arrow-up
            5
            ·
            18 hours ago

            Oh no, this isn’t it; this is extremely basic. As a programmer you will encounter far more complex logic problems, so logic problems become second nature. And it can be frustrating when “normies” don’t understand what appears to a programmer to be a matter of pretty simple logic.

            • Wren@lemmy.world
              arrow-up
              3
              ·
              18 hours ago

              As a “normie” I’d never assume to understand even the most basic of programming. My head hurts to even try and come up with a joke involving the subject.

              • Buffalox@lemmy.world
                arrow-up
                4
                ·
                18 hours ago

                It’s so nice we have people like you, who allow us to feel superior, even if we are inferior in every other way imaginable. 👍

            • Wren@lemmy.world
              arrow-up
              2
              ·
              18 hours ago

              Ahh. Makes sense that it can be automated. Same as with audio engineering. Though I like to normalize by hand sometimes as automated normalization tends to make a track sound lifeless and dead.

            • tal@lemmy.today
              English
              arrow-up
              1
              ·
              17 hours ago

              A Perl program to convert the number of digits in the first numeric field that appears in a list of filenames.

              source
              #!/usr/bin/perl
              # numberit.pl
              # Converts between number formats (number of leading zeros) in numbers in title names
              # Usage: numberit.pl <number of digits> filelist
              use strict;
              use warnings;

              my $digits = shift(@ARGV);
              die "Usage: $0 <number of digits> filelist\n"
                  unless defined($digits) && $digits =~ /^[0-9]+$/;

              foreach my $oldName (@ARGV) {
                  # Split into: optional directory plus the non-digit start of the basename,
                  # the first run of digits in the basename, and the rest of the basename.
                  # Skip names without digits; a failed match would leave stale captures.
                  next unless $oldName =~ m/^((?:.*\/)?[^0-9\/]*)([0-9]+)([^\/]*)$/;
                  my ($prefix, $number, $postfix) = ($1, $2, $3);

                  my $newName = $prefix . sprintf("%0${digits}d", $number) . $postfix;
                  next if $newName eq $oldName;

                  # Refuse to overwrite an existing file.
                  if (-e $newName) {
                      warn "skipping $oldName: $newName already exists\n";
                      next;
                  }
                  rename($oldName, $newName) or warn "cannot rename $oldName: $!\n";
              }
              

              Looks something like:

              $ touch a1.mp3
              $ touch a2.mp3
              $ numberit.pl 3 *
              $ ls
              a001.mp3  a002.mp3
              $ numberit.pl 2 *
              $ ls
              a01.mp3  a02.mp3
              $
              
      • Skullgrid@lemmy.world
        arrow-up
        3
        ·
        18 hours ago

        Fair point, but how does it reconcile that 11 and 12 comes before 2?

        it doesn’t, it reconciles that 02 comes before 11

      • konalt@lemmy.world
        arrow-up
        3
        ·
        19 hours ago

        It’s sorted alphabetically. All the numbers that start with “1” come first, then all the numbers that start with “2”, regardless of how long the actual number is.

  • rumschlumpel@feddit.org
    arrow-up
    27
    ·
    19 hours ago

    That’s just alphabetical sorting. There are other sorting styles that would put 11 after 2, but those aren’t available everywhere.

    • Sculptus Poe@lemmy.world
      arrow-up
      6
      ·
      18 hours ago

      You are correct, she should have used leading '0’s. Really, they could have fixed that sort. Excel does it without leading '0’s. Excel also does a lot of other nonsense because it is trying too hard, so maybe I prefer the dumb sort. I try to get my people to start dated files with YYYYMMDDHHMM just because the file sorting is sort of dumb, but they insist on MMDDYYYY so all the years get jumbled together by month…

      • knightly the Sneptaur@pawb.social
        arrow-up
        6
        ·
        edit-2
        18 hours ago

        This is 100% my pet peeve. Obviously Y-M-D-H-M is the correct format for timestamps because an alphanumeric sort will put them in chronological order.
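
        A quick sketch of why, with three made-up timestamps formatted both ways. It relies on every field being zero-padded to a fixed width, which `strftime` does for these format codes.

```python
from datetime import datetime

stamps = [
    datetime(2024, 1, 15, 9, 30),
    datetime(2023, 12, 31, 23, 59),
    datetime(2024, 1, 2, 8, 5),
]

ymd = [s.strftime("%Y%m%d%H%M") for s in stamps]  # year first
mdy = [s.strftime("%m%d%Y") for s in stamps]      # month first

print(sorted(ymd))  # ['202312312359', '202401020805', '202401150930']
print(sorted(mdy))  # ['01022024', '01152024', '12312023']
```

        The year-first names sort chronologically as plain text, while the month-first names put the December 2023 file after both January 2024 files.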