EasyUDF - Manual
Functions
BBAN
- Dichiarazione
-
DECLARE EXTERNAL FUNCTION BBAN
CSTRING(8) NULL,
CSTRING(8) NULL,
CSTRING(32) NULL,
CSTRING(32)
RETURNS PARAMETER 4
ENTRY_POINT 'fn_bban' MODULE_NAME 'EasyUDF'; - Esempi
-
BBAN('06055','00000','000000000000') = 'I0605500000000000000000'
BBAN('06055',null,null) = '' - Descrizione
-
Il BBAN è la coordinata bancaria nazionale che
consente di identificare, in maniera standard ed univoca, il conto corrente del beneficiario permettendo
all'ordinante o alla banca di quest'ultimo di verificarne la correttezza grazie alla presenza del
carattere di controllo. La struttura del BBAN è fissa ed è stata dettata, per il nostro paese, da
standard dell'ABI; in particolare la
lunghezza di 23 caratteri è fissa ed è così composta:
Il BBAN non può contenere caratteri speciali, deve essere composto da numeri e lettere maiuscole; il numero conto effettivo deve essere allineato a destra e riempito, eventualmente, di zeri a sinistra fino all'effettivo riempimento del campo. Il primo carattere del BBAN (CIN) ha la funzione di carattere di controllo dell'esatta trascrizione dei successivi 22 caratteri. Il CIN deve essere un carattere alfabetico, l'ABI e il CAB numerici. Dal 16 giugno 2003 l'esecuzione di bonifici Italia con coordinate bancarie incomplete o inesatte è assoggettata ad una penale a carico dell'ordinante del bonifico (per il momento il carattere di controllo CIN rimane un dato facoltativo non assoggettato a penale). Il codice BBAN è riportato su ogni estratto conto e su ogni contabile di conto corrente.
CIN ABI CAB CONTO codice di controllo 12 caratteri (UNA LETTERA) (CINQUE CIFRE) (CINQUE CIFRE) (MAIUSCOLE E NUMERI) X 06055 00000 000000000000 - Links
- Wikipedia - Coordinate bancarie
CFCHECK
- Dichiarazione
-
DECLARE EXTERNAL FUNCTION CFCHECK
CSTRING(32) NULL
RETURNS SMALLINT BY VALUE
ENTRY_POINT 'fn_cf_check' MODULE_NAME 'EasyUDF'; - Esempi
-
CFCHECK('RSSBBR69C48F839A') > 0
CFCHECK('PPPPLT80R10M082K') > 0
CFCHECK('PPPPLT80R10M082Z') = 0
CFCHECK(null) = 0 - Descrizione
-
Il numero di codice fiscale delle persone fisiche è costituito da un'espressione alfanumerica di sedici caratteri.
I primi quindici caratteri sono indicativi dei dati anagrafici di ciascun soggetto secondo l'ordine seguente:
- tre caratteri alfabetici per il cognome;
- tre caratteri alfabetici per il nome;
- due caratteri numerici per l'anno di nascita;
- un carattere alfabetico per il mese di nascita;
- due caratteri numerici per il giorno di nascita ed il sesso;
- quattro caratteri, di cui uno alfabetico e tre numerici per il comune italiano o per lo Stato estero di nascita.
- Algoritmo
-
- Caratteri indicativi del cognome
-
I cognomi che risultano composti da più parti o comunque separati od interrotti, vengono
considerati come se fossero scritti secondo un'unica ed ininterrotta successione di caratteri.
- Per i soggetti coniugati di sesso femminile si prende in considerazione soltanto il cognome da nubile.
- Se il cognome contiene tre o più consonanti, i tre caratteri da rilevare sono, nell'ordine, la prima, la seconda e la terza consonante.
- Se il cognome contiene due consonanti, i tre caratteri da rilevare sono, nell'ordine, la prima e la seconda consonante e la prima vocale.
- Se il cognome contiene una consonante e due vocali, si rilevano, nell'ordine, quella consonante e quindi la prima e la seconda vocale.
- Se il cognome contiene una consonante e una vocale, si rilevano la consonante e la vocale, nell'ordine, e si assume come terzo carattere la lettera x.
- Se il cognome è costituito da due sole vocali, esse si rilevano, nell'ordine, e si assume come terzo carattere la lettera x.
- Caratteri indicativi del nome
-
I nomi doppi, multipli o comunque composti, vengono considerati come scritti per esteso in ogni loro parte e secondo un'unica ed ininterrotta successione di caratteri.
- Se il nome contiene quattro o più consonanti, i tre caratteri da rilevare sono, nell'ordine, la prima, la terza e la quarta consonante.
- Se il nome contiene tre consonanti, i tre caratteri da rilevare sono, nell'ordine, la prima, la seconda e la terza consonante.
- Se il nome contiene due consonanti, i tre caratteri da rilevare sono, nell'ordine, la prima e la seconda consonante e la prima vocale.
- Se il nome contiene una consonante e due vocali, i tre caratteri da rilevare sono, nell'ordine, quella consonante e quindi la prima e la seconda vocale.
- Se il nome contiene una consonante e una vocale, si rilevano la consonante e la vocale, nell'ordine, e si assume come terzo carattere la lettera x.
- Se il nome è costituito da due sole vocali, esse si rilevano nell'ordine, e si assume come terzo carattere la lettera x.
- Data, sesso e luogo di nascita
-
I due caratteri numerici indicativi dell'anno di nascita sono, nell'ordine, la cifra delle decine e la cifra delle unità dell'anno stesso.
Il carattere alfabetico corrispondente al mese di nascita è quello stabilito per ciascun mese nella seguente tabella:
I due caratteri numerici indicativi del giorno di nascita e del sesso vengono determinati nel modo seguente:
Lettera Mese Lettera Mese Lettera Mese A gennaio E maggio P settembre B febbraio H giugno R ottobre C marzo L luglio S novembre D aprile M agosto T dicembre - per i soggetti maschili il giorno di nascita figura invariato, con i numeri da uno a trentuno, facendo precedere dalla cifra zero i giorni del mese dall'uno al nove;
- per i soggetti femminili il giorno di nascita viene aumentato di quaranta unità, per cui esso figura con i numeri da quarantuno a settantuno.
- Persone fisiche con identica espressione alfanumerica (omocodia)
-
Quando due o più soggetti hanno dati anagrafici che generano lo stesso codice fiscale (omocodici), per ciascuno di essi si provvede a differenziare il codice fiscale. A tal fine, si effettuano, nell'ambito dei sette caratteri numerici
contenuti nel codice, sistematiche sostituzioni di una o più cifre a partire da quella di destra, con corrispondenti caratteri alfabetici secondo la seguente tabella:
Numero Lettera Numero Lettera Numero Lettera Numero Lettera Numero Lettera 0 L 2 N 4 Q 6 S 8 U 1 M 3 P 5 R 7 T 9 V - Carattere alfabetico di controllo
-
Il sedicesimo carattere ha funzione di controllo dell'esatta trascrizione dei primi quindici caratteri e viene
determinato in questo modo:
-
ciascuno degli anzidetti quindici caratteri, a seconda che occupi posizione di ordine pari o posizione di
ordine dispari, viene convertito in un valore numerico, in base alle tabelle di corrispondenza successivamente
riportate. Per i sette caratteri con posizione di ordine pari:
Per gli otto caratteri con posizione di ordine dispari:
Carattere Valore Carattere Valore Carattere Valore Carattere Valore 0 1 9 21 I 19 R 8 1 0 A 1 J 21 S 12 2 5 B 0 K 2 T 14 3 7 C 5 L 4 U 16 4 9 D 7 M 18 V 10 5 13 E 9 N 20 W 22 6 15 F 13 O 11 X 25 7 17 G 15 P 3 Y 24 8 19 H 17 Q 6 Z 23 Carattere Valore Carattere Valore Carattere Valore Carattere Valore 0 0 9 9 I 8 R 17 1 1 A 0 J 9 S 18 2 2 B 1 K 10 T 19 3 3 C 2 L 11 U 20 4 4 D 3 M 12 V 21 5 5 E 4 N 13 W 22 6 6 F 5 O 14 X 23 7 7 G 6 P 15 Y 24 8 8 H 7 Q 16 Z 25 -
I valori numerici così determinati vengono addizionati e la somma si divide per il numero 26.
Il carattere di controllo si ottiene convertendo il resto di tale divisione nel carattere alfabetico ad
esso corrispondente nella sotto indicata tabella:
Resto Lettera Resto Lettera Resto Lettera Resto Lettera 0 A 7 H 14 O 21 V 1 B 8 I 15 P 22 W 2 C 9 J 16 Q 23 X 3 D 10 K 17 R 24 Y 4 E 11 L 18 S 25 Z 5 F 12 M 19 T 6 G 13 N 20 U
-
ciascuno degli anzidetti quindici caratteri, a seconda che occupi posizione di ordine pari o posizione di
ordine dispari, viene convertito in un valore numerico, in base alle tabelle di corrispondenza successivamente
riportate. Per i sette caratteri con posizione di ordine pari:
CFCONTROLCHAR
- Dichiarazione
-
DECLARE EXTERNAL FUNCTION CFCONTROLCHAR
CSTRING(32) NULL,
CHAR(1)
RETURNS PARAMETER 2
ENTRY_POINT 'fn_cf_controlchar' MODULE_NAME 'EasyUDF'; - Esempi
-
CFCONTROLCHAR('RSSBBR69C48F839') = 'A'
CFCONTROLCHAR('PPPPLT80R10M082') = 'K'
CFCONTROLCHAR('XYZ') = '?'
CFCONTROLCHAR(null) = '?' - Descrizione
- Calcola il carattere di controllo del codice fiscale. Per maggiori informazioni consultare CFCHECK.
CIN
- Dichiarazione
-
DECLARE EXTERNAL FUNCTION CIN
CSTRING(32) NULL,
CHAR(1)
RETURNS PARAMETER 2
ENTRY_POINT 'fn_cin' MODULE_NAME 'EasyUDF'; - Esempi
-
CIN('0539014000000000091363') = 'Q'
CIN('0605500000000000000000') = 'I'
CIN(null) = '?' - Descrizione
- Calcola il codice CIN relativo ad un dato BBAN. Per maggiori informazioni consultare BBAN.
EMAILCHECK
- Declaration
-
DECLARE EXTERNAL FUNCTION EMAILCHECK
CSTRING(256) NULL
RETURNS SMALLINT BY VALUE
ENTRY_POINT 'fn_emailcheck' MODULE_NAME 'EasyUDF'; - Examples
-
EMAILCHECK('dummy@domain.it') > 0
EMAILCHECK('dummy@domain.i') = 0
EMAILCHECK('dum!my@domain.it') = 0
EMAILCHECK(null) = 0 - Description
- Fast e-mail validation based on [RFC2821] and [RFC2822] syntax. This IS NOT a DNS Lookup function.
FLOATTOSTR
- Declaration
-
DECLARE EXTERNAL FUNCTION FLOATTOSTR
DOUBLE PRECISION NULL,
CSTRING(256),
CSTRING(256)
RETURNS PARAMETER 3
ENTRY_POINT 'fn_floattostr' MODULE_NAME 'EasyUDF'; - Examples
-
FLOATTOSTR(6.012,'%.2f') = '6.01'
FLOATTOSTR(6.1,'%f%%') = '6.1%'
FLOATTOSTR(6.1,'%.0e') = '6e+00'
FLOATTOSTR(null,'%.2f') = '' - Description
-
Applies a format specifier, contained in the format string pointed to by second parameter, and outputs the
formatted data to third parameter. The format string contains two types of objects: plain characters, which
are simply copied to the output, and conversion specifications. Each conversion specification is introduced
by the character %. For more details see the C
printf()family of I/O functions.
GCD
- Declaration
-
DECLARE EXTERNAL FUNCTION GCD
INTEGER NULL,
INTEGER NULL
RETURNS INTEGER BY VALUE
ENTRY_POINT 'fn_gcd' MODULE_NAME 'EasyUDF'; - Examples
-
GCD(12,18) = 6
GCD(null,18) = 0 - Description
-
The Greatest Common Divisor (GCD) of two integers, not both zero, is the largest integer that divides both of them.
It is convenient to set GCD(0,0)=0.
Basic identities concerning the GCD:- GCD(N1,N2) = GCD(N2,N1);
- GCD(N1,N2) = GCD(-N1,N2);
- GCD(N1,0) = ABS(N1);
- Algorithm
-
Euclid's algorithm is described in the Elements of Euclid (circa 300 B.C.E.). It is based on the following
recursion theorem:
GCD(N1,N2) = GCD(N2,N1 mod N2)
GTINCHECK
- Declaration
-
DECLARE EXTERNAL FUNCTION GTINCHECK
CSTRING(256) NULL
RETURNS SMALLINT BY VALUE
ENTRY_POINT 'fn_gtincheck' MODULE_NAME 'EasyUDF'; - Examples
-
GTINCHECK('1017422567805') > 0
GTINCHECK('1017422567803') = 0
GTINCHECK('123456789012345613') > 0
GTINCHECK(null) = 0 - Description
-
As the name implies, the GTIN helps automate the
trading process, basically buying and selling. GTINs are therefore assigned to any item (product or
service) that may be priced, or ordered, or invoiced at any point in any supply chain. The GTIN is
then used to to retrieve pre-defined information about the item. The key benefit is that information
about the item can be retrieved about the product from the GTIN whether it is read in a GS1 Bar Code,
exchanged via a GS1 eCOM message or accessed from the Global Data Synchronisation Network.
The term Global Trade Item Number (GTIN) encompasses all the variants of what were previously known as: EAN-8, UPC-12 (UPC), EAN/UCC-13 and EAN/UCC-14 numbers. If a product is marked with an EAN-8, UPC-A, UPC-E, EAN-13 or a GS1-128 bar code, it meets GTIN requirement. - Algorithm
-
We'll look at the 8-digit version (the calculations for the 13-digit and 14-digit version are very similar).
Since the last digit is a check digit, only 7 of the 8 digits actually encode information.
The format uses a modulus 10 scheme, with check digits (ac) defined by:
For example, if we start with the number 1234567 in the EAN-8 scheme, then our check digit is:ac = -(a1, a2, … a6, a7) · (3, 1, 3, 1, 3, 1, 3) (mod 10)
which makes the full bar code number 12345670.ac = -(1, 2, 3, 4, 5, 6, 7) · (3, 1, 3, 1, 3, 1, 3) (mod 10) =
- (1·3 + 2·1 + 3·3 + 4·1 + 5·3 + 6·1 + 7·3) (mod 10) =
-60 (mod 10) = 0
If a digit d whose weight is 1 is changed to c, the weighted sum will change by d-c. The error will go undetected only ifd-c≡0 (mod 10). But this happens only when d=c, in which case there has not been an error after all, so all errors of this kind are caught. What if the weight were 3? Then the error would be undetected if3(d-c)≡0 (mod 10). But again, this cannot happen unless d=c. Thus, this method has a 100% single error detection rate. Suppose two adjacent digits, cd, are transposed to dc. If c's weight is 3 (hence d's weight is 1), the weighted sum is changed by(3c + d) - (c + 3d) = 2 (c-d)which will be detected unless2(c-d)≡0 (mod 10), which can happen only if c and d differ by 5. The same would have applied if c had been weighted by 1 and d by 3. As a result, the transpositions that will go undetected must involve 0↔5, 1↔6, 2↔7, 3↔8 and 4↔9. So, 10 transpositions are undetectable. There are 100 possibilities for each pairing, and the transposition of 90 of these would result in an error. Therefore the detection rate of transposition errors is 80/90 = 88.9%. - Links
-
GS1- The global language of business
Check digit calculator
Take a break by Emily Dixon
IBAN
- Declaration
-
DECLARE EXTERNAL FUNCTION IBAN
CSTRING(4) NULL,
CSTRING(32) NULL,
CSTRING(32)
RETURNS PARAMETER 3
ENTRY_POINT 'fn_iban' MODULE_NAME 'EasyUDF'; - Examples
-
IBAN('IT','Q0539014000000000091363') = 'IT45Q0539014000000000091363'
IBAN('AT','1904300234573201') = 'AT611904300234573201'
IBAN(null,'1904300234573201') = '' - Description
- The routing of payments internationally used to require the payer to inform the sending bank of the location of the receiving account (bank name, branch address) as well the account number of the destination account. The location of the receiving account is often identifiable from various routing codes which are often specific to the national payment system, and therefore are more readily machine processed than are names and addresses. National routing codes and account numbers often contain check digits which are used nationally to help detect transcription errors before payments are sent. However because the national systems vary, there was no common format for giving routing information that could be applied internationally. Prior to IBAN therefore, it was impractical for banks to validate such routing information prior to the sending of payments. Routing errors were therefore frequent causing payments to be delayed and often created costs to the sending and receiving banks and often to intermediate routing banks also. The standard IBAN is intended to carry all the routing information needed to get a payment from one bank to another wherever it may be and carries check digiting which can be validated in any country according to a single standard. In this way, the validity of a routing destination can be validated by the sending bank (or its customer) from a single string of data which contains all the necessary routing data to get money into the destination account and routing errors in international (or cross-border) payments are therefore virtually eliminated.
- Algorithm
-
The checksum is a basic ISO 7064 mod 97-10 calculation where the remainder must equal 1. To validate the checksum:
- check that the total IBAN length is correct as per the country. If not, the IBAN is invalid;
- move the four initial characters to the end of the string;
- replace the letters in the string with digits, expanding the string as necessary, such that A=10, B=11 and Z=35;
- convert the string to an integer and mod-97 the entire number.
- check that the total IBAN length is correct as per the country. If not, the IBAN is invalid;
- make the checksum digits 00 (e.g. GB00 for the UK);
- move the four initial characters to the end of the string;
- replace the letters in the string with digits, expanding the string as necessary, such that A=10, B=11 and Z=35;
- convert the string to an integer and mod-97 the entire number;
- subtract the remainder from 98 and pad with a leading 0, if necessary.
IP4CHECK
- Declaration
-
DECLARE EXTERNAL FUNCTION IP4CHECK
CSTRING(256) NULL
RETURNS SMALLINT BY VALUE
ENTRY_POINT 'fn_ip4check' MODULE_NAME 'EasyUDF'; - Examples
-
IP4CHECK('192.168.1.1') > 0
IP4CHECK('129.0.1') = 0
IP4CHECK('129.0.526.12') = 0
IP4CHECK(null) = 0 - Description
- Check IPv4 addresses represented in dotted octet format. A host identified by an IPv4 literal address is represented in dotted-decimal notation (a sequence of four decimal numbers in the range 0 to 255, separated by "."), as described in [RFC1123] by reference to [RFC0952].
- Links
- Wikipedia - IP4
LCM
- Declaration
-
DECLARE EXTERNAL FUNCTION LCM
INTEGER NULL,
INTEGER NULL
RETURNS INTEGER BY VALUE
ENTRY_POINT 'fn_lcm' MODULE_NAME 'EasyUDF'; - Examples
-
LCM(12,18) = 36
LCM(null,4) = 0 - Description
- The Least Common Multiple LCM of two integer N1 and N2. It is defined to be the smallest positive integer which is a multiple of both N1 and N2.
- Algorithm
- Equation N1·N2 = GCD(N1,N2)·LCM(N1,N2) reduces the calculation of LCM to the calculation of GCD.
LEVENSHTEIN
- Declaration
-
DECLARE EXTERNAL FUNCTION LEVENSHTEIN
CSTRING(32767) NULL,
CSTRING(32767) NULL
RETURNS INTEGER BY VALUE
ENTRY_POINT 'fn_ld' MODULE_NAME 'EasyUDF'; - Description
-
The words 'computer' and 'commuter' are very similar, and a change of just one letter, p->m will change the first
word into the second. The word 'sport' can be changed into 'sort' by the deletion of the 'p', or equivalently,
'sort' can be changed into 'sport' by the insertion of 'p'. The Levenshtein distance gives an indication of how
'close' two strings are.
LEVENSHTEIN(s1,s2), the Levenshtein distance of two strings s1 and s2, is the minimum number of mutations to change s1 into s2. Where mutation is defined as:- change a letter;
- insert a letter;
- delete a letter.
LEVENSHTEIN('','')=0;LEVENSHTEIN(s,'')=|s|LEVENSHTEIN(s1+c1,s2+c2)=min(LEVENSHTEIN(s1,s2)+(c1==c2 ? 0 : 1),
LEVENSHTEIN(s1+c1,s2)+1,
LEVENSHTEIN(s1,s2+c2)+1)
- if c1==c2 they can be matched for no penality (distance 0) and the Levenshtein distance is LEVENSTEIN(s1,s2);
- if c1 differs from c2, then c1 could be changed into c2 (distance 1), giving the overall cost LEVENSHTEIN(s1,s2)+1;
- we can delete c1 and edit s1 into s2+c2, giving the overall cost LEVENSHTEIN(s1,s2+c2)+1;
- the last possibility is to edit s1+c1 into s2 and then insert c2, giving the overall cost LEVENSTHTEIN(s1+c1,s2)+1.
- Algorithm
-
The recurrence relations imply an obvious ternary-recursive routine. This is not a good idea because it is
exponentially slow, and impractical for strings of more than a very few characters.
Examination of the relations reveals that LEVENSHTEIN(s1,s2) depends only on LEVENSHTEIN(s1',s2') where s1' is shorter than s1, or s2' is shorter than s2, or both. This allows the dynamic programming technique to be used.
A two-dimensional matrix, m[0…|s1|,0…|s2|] is used to hold the edit distance values:m[i,j]=LEVENSHTEIN(s1[1…i], s2[1…j])m[0,0]=0m[i,0]=i, i=1…|s1|m[0,j]=j, j=1…|s2|m[i,j]=min(m[i-1,j-1] + (s1[i]==s2[j] ? 0 : 1),
m[i-1,j]+1,
m[i,j-1]+1) i=1…|s1|, j=1…|s2|
EasyUDF implementation is O(n) in space complexity and treats transpositions as a single point mutations.. - Applications
-
- File Revision
- The Unix command diff f1 f2 finds the difference between files f1 and f2, producing an edit script to convert f1 into f2. If two (or more) computers share copies of a large file F, and someone on machine-1 edits F=F.bak, making a few changes, to give F.new, it might be very expensive and/or slow to transmit the whole revised file F.new to machine-2. However, diff F.bak F.new will give a small edit script which can be transmitted quickly to machine-2 where the local copy of the file can be updated to equal F.new. diff treats a whole line as a "character" and uses a special edit-distance algorithm that is fast when the "alphabet" is large and there are few chance matches between elements of the two strings (files). In contrast, there are many chance character-matches in DNA where the alphabet size is just 4, {A,C,G,T}.
- Remote Screen Update Problem
- If a computer program on machine-1 is being used by someone from a screen on (distant) machine-2, e.g. via rlogin, then machine-1 may need to update the screen on machine-2 as the computation proceeds. One approach is for the program (on machine-1) to keep a "picture" of what the screen currently is (on machine-2) and another picture of what it should become. The differences can be found (by an algorithm related to edit-distance) and the differences transmitted… saving on transmission band-width.
- Spelling Correction
- Algorithms related to the edit distance may be used in spelling correctors. If a text contains a word, w, that is not in the dictionary, a 'close' word, i.e. one with a small edit distance to w, may be suggested as a correction. Transposition errors are common in written text. A transposition can be treated as a deletion plus an insertion, but a simple variation on the algorithm can treat a transposition as a single point mutation.
- Plagiarism Detection
- The edit distance provides an indication of similarity that might be too close in some situations.
- Molecular Biology
- Similar measures are used to compute a distance between DNA sequences (strings over {A,C,G,T} or protein sequences over an alphabet of 20 amino acids) for various purposes: to find genes or proteins that may have shared functions or properties to infer family relationships and evolutionary trees over different organisms…
- Speech Recognition
- Algorithms similar to those for the edit-distance problem are used in some speech recognition systems: find a close match between a new utterance and one in a library of classified utterances.
- Examples
-
LEVENSHTEIN('sport','spot') = 1
LEVENSHTEIN('appropriate meaning','approximate matching') = 0,
LEVENSHTEIN(null,'dummy') = -1 - Links
- Dynamic Programming Algorithm (DPA) for Edit-Distance
LUHNCHECK
- Declaration
-
DECLARE EXTERNAL FUNCTION LUHNCHECK
CSTRING(256) NULL
RETURNS SMALLINT BY VALUE
ENTRY_POINT 'fn_luhncheck' MODULE_NAME 'EasyUDF'; - Examples
-
LUHNCHECK('456565654') = 1
LUHNCHECK(null) = 0 - Description
-
The Luhn algorithm or Luhn formula, also known as the "modulus 10" or "mod 10" algorithm, was developed in the 1960s
as a method of validating identification numbers. It is a simple checksum formula used to validate a variety of
account numbers, such as credit card numbers and Canadian Social Insurance Numbers. Much of its notoriety comes from
credit card companies' adoption of it shortly after its creation in the late 1960s by IBM scientist Hans Peter Luhn
(1896-1964). The algorithm is in the public domain and is in wide use today. It is not intended to be
cryptographically secure hash function; it protects against random error, not malicious attack. Most credit cards
and many government identification numbers use the algorithm as a simple method of distinguishing valid numbers
from collections of random digits.
Consider the example identification number 456-565-654. The first step is to double every other digit, starting with
the second-to-last digit and moving left, and sum the digits in the result. The following table shows this step
(highlighted rows indicating doubled digits):
30 ≡ 0 (mod 10) so the number is valid.
ai σ(ai) Sum: 30 4 4 5 1 6 6 5 1 6 6 5 1 6 6 5 1 4 4 - Algorithm
-
The formula generates a check digit, which is usually appended to a partial account number to generate the full
account number.
Let the credit card number be (a1, a2,…, a15, aN), with
aN being the check digit:
where the permutationaN = -[σ(a1) + a2 + σ(a3) + a4 + … + a14 + σ(aN-1)] (mod 10)σ = (0)(124875)(36)(9)(in other wordsσ(0)=0, σ(1)=2, σ(2)=4, σ(3)=6 … σ(9)=9). Notice that:
This scheme will catch all single-digit errors. For example if digit ai is changed from c to d, and i is even, the remainder will change by c-d, which is non-zero (and is, of course, smaller than the modulus N=10). If i is odd, it will change byσ(x) = (2·x) mod 9σ(c)-σ(d). This is again non-zero: σ(c) cannot be equal to σ(d) if σ is a permutation. How about transpositions? If two adjacent digits c and d are transposed, one of them must have the permutation applied - say c. The remainder will be unchanged only ifσ(c)+d = σ(d)+c. Sinceσ(x) = (2·x) mod 9, this happens only whenc ≡ d (mod 9), that is, only when c and d are 0 and 9 (in either order). Therefore, for each pair of adjacent digits, of the 90 possible transposition errors, two will be undetectable. So the detection rate for transpositions is 88/90 = 97.8%. - Links
- Take a break
PIVACHECK
- Dichiarazione
-
DECLARE EXTERNAL FUNCTION PIVACHECK
CSTRING(256) NULL
RETURNS SMALLINT BY VALUE
ENTRY_POINT 'fn_pivacheck' MODULE_NAME 'EasyUDF'; - Esempi
-
PIVACHECK('00000000000') > 0
PIVACHECK('01644560508') > 0
PIVACHECK('123456') = 0
PIVACHECK(null) = 0 - Descrizione
- Restituisce 0 nel caso il parametro in ingresso non costituisca una partita iva valida.
SHA256
- Declaration
-
DECLARE EXTERNAL FUNCTION SHA256
CSTRING(32767) NULL,
CSTRING(128)
RETURNS PARAMETER 2
ENTRY_POINT 'fn_sha256' MODULE_NAME 'EasyUDF'; - Examples
-
SHA256('abc') = 'ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad'
SHA256('abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq') = '248d6a61d20638b8e5c026930c3e6039a33ce45964ff2167f6ecedd419db06c1'
SHA256('The quick brown fox jumps over the lazy dog') = 'd7a8fbb307d7809469ca9abcb0082e4f8d5651e46d3cdb762d02d0bf37c9e592'
SHA256('') = 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
SHA256(null) = '' - Description
-
The SHA-256 hash function is a cryptographic hash
function designed by the NSA and published as a
US government standard. SHA-1 is employed in a large variety
of popular security applications and protocols, including TLS,
SSL, PGP,
SSH, S/MIME
and IPsec. It was considered to be the successor to MD5, an earlier, widely-used hash function. Both
are reportedly compromised.
The other four variants (SHA-224, SHA-256, SHA-384, and SHA-512) are sometimes collectively referred to as SHA-2 functions or simply SHA-2. No attacks have yet been reported on the SHA-2 variants, but since they are similar to SHA-1, researchers are worried, and are developing candidates for a new, better hashing standard.
SHA-256 is a hash function computed with 32-bit words. It have not received as much scrutiny by the public cryptographic community as SHA-1 has and so their cryptographic security is not yet as well-established. Gilbert and Handschuh (2003) have studied the newer variant and found no weaknesses.
A prime motivation for the publication of the Secure Hash Algorithm was the Digital Signature Standard, in which it is incorporated. The SHA hash functions have been used as the basis for the SHACAL block ciphers. - Algorithm
-
Note: All variables are unsigned 32 bits and wrap modulo 232 when calculating. Initialize variables (first 32 bits of the fractional parts of the square roots of the first 8 primes 2..19): h0 := 0x6a09e667 h1 := 0xbb67ae85 h2 := 0x3c6ef372 h3 := 0xa54ff53a h4 := 0x510e527f h5 := 0x9b05688c h6 := 0x1f83d9ab h7 := 0x5be0cd19 Initialize table of round constants (first 32 bits of the fractional parts of the cube roots of the first 64 primes 2..311): k(0..63) := 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 Pre-processing: append the bit '1' to the message append k bits '0', where k is the minimum number >= 0 such that the resulting message length (in bits) is congruent to 448 (mod 512) append length of message (before pre-processing), in bits, as 64-bit big-endian integer Process the message in successive 512-bit chunks: break message into 512-bit chunks for each chunk break chunk into sixteen 32-bit big-endian words w(i), 0 = i = 15 Extend the sixteen 32-bit words into sixty-four 32-bit words: for i from 16 to 63 s0 := (w(i-15) rightrotate 7) xor (w(i-15) rightrotate 18) xor (w(i-15) rightshift 3) s1 := (w(i-2) rightrotate 17) xor (w(i-2) rightrotate 19) xor (w(i-2) rightshift 10) w(i) := w(i-16) + s0 + w(i-7) + s1 Initialize hash value for this chunk: a := h0 b := h1 c := h2 d := h3 e := h4 f := h5 g := h6 h := h7 Main loop: for i from 0 to 63 s0 := (a rightrotate 2) xor (a rightrotate 13) xor (a rightrotate 22) maj := (a and b) xor (a and c) xor (b and c) t2 := s0 + maj s1 := (e rightrotate 6) xor (e rightrotate 11) xor (e rightrotate 25) ch := (e and f) xor ((not e) and g) t1 := h + s1 + ch + k(i) + w(i) h := g g := f f := e e := d + t1 d := c c := b b := a a := t1 + t2 Add this chunk's hash to result so far: h0 := h0 + a h1 := h1 + b h2 := h2 + c h3 := h3 + d h4 := h4 + e h5 := h5 + f h6 := h6 + g h7 := h7 + h Produce the final hash value (big-endian): digest = hash = h0 append h1 append h2 append h3 append h4 append h5 append h6 append h7 - Applications
- SHA-256 is a required secure hash algorithms for use in US Federal applications, including use by other cryptographic algorithms and protocols, for the protection of sensitive unclassified information. FIPS PUB 180-1 also encouraged adoption and use of SHA-1 by private and commercial organizations.
- Links
- SHA hash functions
SIACHECK
- Dichiarazione
-
DECLARE EXTERNAL FUNCTION SIACHECK
CSTRING(32) NULL
RETURNS SMALLINT BY VALUE
ENTRY_POINT 'fn_siacheck' MODULE_NAME 'EasyUDF'; - Esempi
-
SIACHECK('1234A') > 0
SIACHECK('5A123') = 0
SIACHECK(null) = 0 - Descrizione
- Restituisce 0 nel caso il parametro in ingresso non costituisca un codice Sia valido.
SOUNDEX
- Declaration
-
DECLARE EXTERNAL FUNCTION SOUNDEX
CSTRING(256) NULL,
CSTRING(8)
RETURNS PARAMETER 2
ENTRY_POINT 'fn_soundex' MODULE_NAME 'EasyUDF'; - Examples
-
SOUNDEX('Knightridder') = 'K523'
SOUNDEX('Ashcroft') = 'A261'
SOUNDEX('Pflanders') = 'P453'
SOUNDEX(null) = '' - Description
-
Terms that are often misspelled can be a problem for database designers. Names, for example, are
variable length, can have strange spellings, and they are not unique. American names have a
diversity of ethnic origins, which give us names pronounced the same way but spelled differently and
vice versa. Words can be misspelled or have multiple spellings, especially across different cultures
or national sources.
To help solve this problem, we need phonetic algorithms which can find similar sounding terms and names. Just such a family of algorithms exist and are called SoundExes, after the first patented version.
A Soundex search algorithm takes a word, such as a person's name, as input and produces a character string which identifies a set of words that are (roughly) phonetically alike. It is very handy for searching large databases when the user has incomplete data. The original Soundex algorithm was patented by Margaret O'Dell and Robert C. Russell in 1918. The method is based on the six phonetic classifications of human speech sounds (bilabial, labiodental, dental, alveolar, velar, and glottal), which in turn are based on where you put your lips and tongue to make the sounds. The algorithm is fairly straight forward to code and requires no backtracking or multiple passes over the input word. - Links
-
Dominic John Repici: Understanding the soundex algorithm
The Soundex Indexing System