Hi Art,
Agreed. The purpose of the masking should be clarified.
I never used LBAC. It may provide a solution to some aspects of the requirement? I just know that setting up and using LBAC is not the easiest thing to do.
Hi Andreas,
I don't think that the intention of masking is to change the underlying data. I assume that it should display non-identifiable data to non-authorized users, while it should still be possible to display the original data to authorized users.
Regards, Martin
-- Martin Fuerderer
Software Engineer
hcltechsw.com
::DISCLAIMER::
The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only. E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents (with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates. Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification, distribution and / or publication of this message without the prior written consent of authorized representative of HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately. Before opening any email and/or attachments, please check them for viruses and other defects.
Original Message:
Sent: 3/10/2023 5:58:00 AM
From: Art Kagel
Subject: RE: Data Masking in Informix
Martin:
The problem with just replacing all PII with 'xxxxx', if this is for the purpose of masking production data for test use, is that the queries will not behave as they would with the real data because every identifier will have the same value of 'xxxxx'! You have to use a masking or obfuscation that leaves the distribution of values similar to production.
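A minimal sketch of this point, in Python rather than Informix SQL. The sample names, the key, and the helper names are all hypothetical, and the keyed hash is only one possible way to get stable tokens; nothing here is an Informix feature. It compares the frequency profile of a column after constant masking and after a stable per-value pseudonym.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; it would have to stay outside the test environment.
SECRET_KEY = b"example-masking-key"


def pseudonymize(value: str, length: int = 12) -> str:
    """Map a value to a stable token: equal inputs always give equal tokens."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def constant_mask(value: str) -> str:
    """The 'xxxxx' approach: every value collapses to the same string."""
    return "xxxxx"


# Hypothetical production column with a skewed distribution.
names = ["Smith", "Jones", "Smith", "Brown", "Smith", "Jones"]

# Sorted occurrence counts describe the shape of the distribution
# without revealing which value has which count.
original_counts = sorted(Counter(names).values())
pseudonym_counts = sorted(Counter(pseudonymize(n) for n in names).values())
constant_counts = sorted(Counter(constant_mask(n) for n in names).values())

print(original_counts)   # frequency profile of the real data
print(pseudonym_counts)  # profile after stable pseudonyms
print(constant_counts)   # profile after constant masking
```

With stable tokens the number of distinct values and their frequencies carry over, so joins, GROUP BY results, and index selectivity in a test system stay close to production. With the constant mask the column degenerates to a single value.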
Art
------------------------------
Art S. Kagel, President and Principal Consultant
ASK Database Management Corp.
www.askdbmgt.com
------------------------------
Original Message:
Sent: Fri March 10, 2023 04:38 AM
From: Martin Fuerderer
Subject: Data Masking in Informix
Hi Vicente,
I'm not sure about the hashing. I think, "masking" simply should not show any meaningful data, i.e. something like just "XXXX..." or the zero-digit for numerical data (possibly 0 for integers and 0.0 for decimals/floats).
Wouldn't hashing make the data somewhat "readable"?
Once the hashing algorithm is known, any user could hash a value he's interested in himself, then use this hash value in a where clause to get all the rows with that value ... no? You could only use "=" in the where clause, i.e. no ">" or "<", but still a hash value provides too much info, I think.
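The concern can be sketched in a few lines of Python. The rows and names are hypothetical, and the list comprehension stands in for an equality filter in a where clause. The second half shows a keyed hash as one possible mitigation; that is an assumption added for contrast, not something proposed in this thread.

```python
import hashlib
import hmac


def md5_mask(value: str) -> str:
    """Unkeyed hash: anyone who knows the algorithm can reproduce it."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def keyed_mask(value: str, key: bytes) -> str:
    """Keyed hash: reproducing it requires the secret key."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Masked column as a non-authorized user would see it (hypothetical rows).
masked_rows = [(1, md5_mask("Smith")), (2, md5_mask("Jones")), (3, md5_mask("Smith"))]

# The user hashes a name of interest and filters on equality,
# the equivalent of: WHERE masked_name = '<hash of Smith>'
target = md5_mask("Smith")
matching_ids = [row_id for row_id, masked in masked_rows if masked == target]
print(matching_ids)  # ids of all rows whose real name is "Smith"

# Contrast: with a key the user does not know, the guessed hash matches nothing.
server_key = b"known-only-to-the-server"  # hypothetical
keyed_rows = [(1, keyed_mask("Smith", server_key)), (2, keyed_mask("Jones", server_key))]
guess = keyed_mask("Smith", b"attacker-guess")
keyed_matches = [row_id for row_id, masked in keyed_rows if masked == guess]
print(keyed_matches)  # empty list
```

Even with a key, equal values still produce equal tokens, so the frequency of a value remains visible to everyone who can read the masked column.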
Regards,
Martin
--
Martin Fuerderer
Software Engineer
hcltechsw.com
Original Message:
Sent: 3/9/2023 8:39:00 AM
From: Vicente Salvador Cubedo
Subject: RE: Data Masking in Informix
In my experience, the data should be stored unmasked, because some clients need to retrieve the original data while others should only get masked data.
There should be a definition of masked columns: table name + column name.
By default, if a column is defined as masked, the result depends on the column type:
CHAR, VARCHAR, LVARCHAR -> should return a basic hash (e.g. MD5) of the content
DATE or DATETIME -> should randomly change the day of the date, and also the hour, minute, or second if it is a timestamp
Numeric columns (FLOAT, DECIMAL, INTEGER) -> should return a numeric hash of the column using some internal algorithm, so that the same original number always masks to the same destination number.
If a column is defined as masked/hashed, we need to know who is allowed to see the original data. All other connections should get masked data in the result set.
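The per-type defaults above could look roughly like this. It is a Python sketch only: the function name, the choice of MD5 for the numeric hash, and the restriction of random days to 1..28 (so the result is a valid date in any month) are assumptions made for illustration.

```python
import hashlib
import random
from datetime import date, datetime
from decimal import Decimal


def mask_value(value):
    """Return a masked version of value, chosen by its type."""
    if isinstance(value, str):
        # Character data: hex MD5 digest of the content (32 characters).
        return hashlib.md5(value.encode("utf-8")).hexdigest()
    if isinstance(value, datetime):
        # Checked before date, because datetime is a subclass of date.
        return value.replace(
            day=random.randint(1, 28),
            hour=random.randint(0, 23),
            minute=random.randint(0, 59),
            second=random.randint(0, 59),
        )
    if isinstance(value, date):
        # Keep year and month, randomize the day.
        return value.replace(day=random.randint(1, 28))
    if isinstance(value, (int, float, Decimal)):
        # Deterministic numeric hash: the same input always gives the same output.
        digest = hashlib.md5(str(value).encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")
    return value  # other types are passed through unmasked in this sketch


print(mask_value("ABCD"))
print(mask_value(date(2023, 3, 9)))
print(mask_value(1234))
```

Note that the string and numeric branches are deterministic while the date branches are random, so a masked date cannot be used for grouping or joining the way a hashed name can.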
So, here is the tricky thing: implementation.
- You can add a list of users who are able to see the original data without masking.
The con of this model is that a DBA can add himself to this list, retrieve the "protected" data, and then remove himself from the authorization list. With this kind of "sensitive" information, no unauthorized user should be able to get unmasked data.
- You can define a password for clients to see unmasked data.
This password can be defined at the database level, and then only sessions sending the proper password are able to see unmasked data (e.g. SET ENVIRONMENT mask_passwd "ABC123";).
Then the installer of the application can set the database mask password, and nobody (not even the DBAs) is able to get unmasked data. Only the client session, at the application level, is able to send the password, and nobody else knows it.
Possible issues: in a first implementation, I would allow only CHAR columns to be masked/hashed. The best way is to allow running proper SQL statements without knowing the details; that is the reason I propose using a hashing algorithm, so you can query how many names your patients have without knowing the real names.
A con of hashing vs. masking is that some column sizes would change; for example, MD5 hashing of a CHAR(15) column returns a CHAR(32) string.
This can be a mess, so maybe Informix should allow masking/hashing only for columns larger than 32, or cut the hash if the column size is smaller.
E.G.
original CHAR(15) with content "ABCD" will return cb08ca4a7bb5f96
original CHAR(32) with content "ABCD" will return cb08ca4a7bb5f9683c19133a84872ca7
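The truncation in this example can be reproduced with a short Python sketch. The function name and the column_width parameter are hypothetical; only the idea of cutting the MD5 hex digest to the column size comes from the proposal above.

```python
import hashlib


def masked_char(value: str, column_width: int) -> str:
    """Return the MD5 hex digest of value, cut to fit a CHAR(column_width) column."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()  # always 32 hex characters
    return digest[:column_width]


short = masked_char("ABCD", 15)  # what a CHAR(15) column would return
full = masked_char("ABCD", 32)   # what a CHAR(32) column would return

print(short)
print(full)
```

The short value is simply a prefix of the full digest. Cutting the digest leaves fewer bits, so two different original values become more likely to share the same masked value as the column gets narrower.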
That's my two cents.
------------------------------
Vicente Salvador Cubedo
Original Message:
Sent: Thu March 09, 2023 06:12 AM
From: Marcus Haarmann
Subject: Data Masking in Informix
Hi,
What is your understanding of "data masking"?
Should a specific column be "masked" in the sense that the original content is overwritten with a number of "xxxxxxx" to make it unreadable?
Should it depend on the role of the user? (e.g. only for specific users at query time, while the original content is stored unmasked)
Should the data of a table be stored encrypted? (this is not masking)
Original Message:
Sent: 3/9/2023 4:46:00 AM
From: Indika Jinadasa
Subject: Data Masking in Informix
Dear All,
Is a "data masking" feature available in Informix? If it is not available, please let us know about similar features in Informix.
Thanks !
Best Regards,
Indika
------------------------------
Indika Jinadasa
------------------------------