How do I make this Guid-to-hash function more concise?

Time:01-04

I came up with a Guid-to-hash function in F# as shown below, but I think it is too verbose. How can I make it more concise?

  let digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
  let nbase = bigint digits.Length
  let zero = bigint.Zero

  let hash (g: System.Guid) =
    (g.ToByteArray(), [| 0x00uy |])
    ||> Array.append
    |> bigint
    |> Array.unfold (fun d ->
      if d = zero then
        None
      else
        let (n, r) = bigint.DivRem(d, nbase)
        Some (r, n))
    |> Array.rev
    |> Array.skipWhile ((=) zero)
    |> Array.map (fun b -> digits.[int b])
    |> System.String

CodePudding user response:

There's a little bit of cleanup you can do like so:

let hash (g: System.Guid) =
    [| yield! g.ToByteArray(); 0x00uy |]
    |> bigint
    |> Array.unfold (fun d ->
        if d = zero then
            None
        else
            let (n, r) = bigint.DivRem(d, nbase)
            Some (r, n))
    // Most significant digit first; it is never zero, so no skipWhile is needed.
    |> Array.rev
    |> Array.map (fun b -> digits.[int b])
    |> System.String

Beyond that, I'm not sure how to make this more succinct, given how I understand you want your hash function to operate.

That being said, if I wanted to hash a guid I'd just use the built-in hash function.
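
As a minimal sketch of that suggestion, assuming "built-in hash function" means `Guid.GetHashCode` or the F# `hash` operator: both return a 32-bit `int`, so the result is fixed size but cannot be converted back to the Guid. The operator is written as `Operators.hash` here because the question defines its own function named `hash`, which would shadow it.

```fsharp
open System

// Sample value taken from the test output later in this thread.
let g = Guid.Parse "bf9f89db-e307-4dcb-b734-a4bbb61a8365"

// .NET's own hash code for the Guid: a 32-bit int, not reversible.
let h1 = g.GetHashCode()

// The F# generic hash operator, fully qualified in case `hash` is shadowed.
let h2 = Operators.hash g

printfn "%d %d" h1 h2
```

Different Guids can share a hash code, so this suits dictionary or set lookups rather than producing a short identifier that maps back to the Guid.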

CodePudding user response:

I wouldn't call this a hash function: a hash function's output is usually fixed length and does not uniquely identify keys (though it should still be uniformly distributed over the key space). This is more like a conversion of the data type bigint to and from other data types, System.Guid and string (with an encoding added).

For conciseness, recursion with a list accumulator seems well suited, as it also performs the reversal that puts the most significant character in the first position.

module Bigint =
    let toString alphabet input =
        let nbase = bigint(String.length alphabet)
        let rec aux acc x =
            if x = 0I then acc
            else aux (alphabet.[int(x % nbase)]::acc) (x / nbase)
        System.String(Array.ofList(aux [] input))

    let fromString alphabet (input : string) =
        let nbase = bigint(String.length alphabet)
        let d = alphabet |> Seq.mapi (fun i c -> c, bigint i) |> dict
        input.ToCharArray()
        |> Array.rev
        |> Array.mapi (fun i c -> pown nbase i * d.[c])
        |> Array.sum

    let fromGuid (g : System.Guid) =
        bigint[| yield! g.ToByteArray(); yield 0uy |]

    let toGuid (bi : bigint) =
        let b = bi.ToByteArray()
        // ToByteArray is little-endian: pad missing high-order bytes at the end
        if Array.length b > 16 then b.[..15]
        else Array.append b (Array.zeroCreate (16 - Array.length b))
        |> fun a -> System.Guid a

The output is variable length; the zero Guid gives an empty string. Testing:

let alphabet2 = "01"
Bigint.toString alphabet2 (bigint 0x181) // 385
// val it : System.String = "110000001"
|> Bigint.fromString alphabet2
// val it : System.Numerics.BigInteger = 385

let alphabet16 =
    System.String(Array.concat[|[|'0'..'9'|];[|'A'..'F'|]|])
Bigint.toString alphabet16 (bigint 0x181) // 385
// val it : System.String = "181"
|> Bigint.fromString alphabet16
// val it : System.Numerics.BigInteger = 385

let alphabet62 =
    System.String(Array.concat[[|'0'..'9'|];[|'A'..'Z'|];[|'a'..'z'|]])
let guid0 = System.Guid()
// val guid0 : System.Guid = 00000000-0000-0000-0000-000000000000
Bigint.fromGuid guid0
|> Bigint.toString alphabet62
// val it : System.String = ""
|> Bigint.fromString alphabet62
// val it : System.Numerics.BigInteger = 0
|> Bigint.toGuid
// val it : System.Guid = 00000000-0000-0000-0000-000000000000
let guid1 = System.Guid.NewGuid()
// val guid1 : System.Guid = bf9f89db-e307-4dcb-b734-a4bbb61a8365
Bigint.fromGuid guid1
|> Bigint.toString alphabet62
// val it : System.String = "35Y8fherqxCBOydJ1VN9UR"
|> Bigint.fromString alphabet62
// val it : System.Numerics.BigInteger =
//   134932760282992047737587898291171854811
|> Bigint.toGuid
// val it : System.Guid = bf9f89db-e307-4dcb-b734-a4bbb61a8365

If you want the conversion with a fixed length, Base64 encoding might be an alternative:

System.Convert.ToBase64String(guid1.ToByteArray()).[..21]
// val it : string = "24mfvwfjy023NKS7thqDZQ"
|> fun s -> System.Guid(System.Convert.FromBase64String(s + "=="))
// val it : System.Guid = bf9f89db-e307-4dcb-b734-a4bbb61a8365
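
If the base-62 form is preferred but should still be fixed length, one option is to left-pad it with the alphabet's zero digit. The following is only a sketch under that assumption; `toFixedString` and `width` are hypothetical names, and the encoding loop simply mirrors the accumulator approach shown earlier. Since 62^21 < 2^128 <= 62^22, 22 characters are enough for any Guid.

```fsharp
open System
open System.Numerics

// Same 62-character alphabet as `digits` in the question.
let alphabet62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

// 62^21 < 2^128 <= 62^22, so 22 characters cover every 128-bit value.
let width = 22

// Hypothetical helper: base-62 encode a Guid, padded to a fixed width.
let toFixedString (g: Guid) =
    let nbase = BigInteger(alphabet62.Length)
    // Prepending each digit to the accumulator yields most-significant-first order.
    let rec aux acc (x: BigInteger) =
        if x.IsZero then acc
        else aux (alphabet62.[int (x % nbase)] :: acc) (x / nbase)
    // The trailing zero byte keeps the little-endian value non-negative.
    let bytes = [| yield! g.ToByteArray(); yield 0uy |]
    let s = String(Array.ofList (aux [] (BigInteger(bytes))))
    // Pad on the left with the zero digit so every result has the same length.
    s.PadLeft(width, alphabet62.[0])

printfn "%s" (toFixedString Guid.Empty)
```

The padding characters are zero digits, so a positional decoder such as `Bigint.fromString` above reads the padded string as the same number.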