[Pandas-dev] How does pandas (Python) read a timestamp as datetime64? (HDF5)

Arthuro Verissimo arthurobdfv at gmail.com
Thu Jun 14 13:47:48 EDT 2018


I'm creating an application in C# that generates an HDF5 file in which one
field of the compound data type is a timestamp (a Unix timestamp), and I
would like pandas to load that timestamp column as datetime64 when it reads
the file. I tried creating the file from pandas and saw that it writes the
dataset with many attributes, which I assume pandas reads and uses to
convert the column while loading the file (the datetime stored in the file
created from pandas is an integer timestamp too). I can't figure out how the
conversion works.
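To make the unit question concrete, here is a minimal sketch in Python. It assumes the C# writer stores whole seconds since the Unix epoch, while a pandas datetime64[ns] value is an integer count of nanoseconds since the same epoch. The sample values and variable names are illustrative only and do not come from the original file.

```python
import numpy as np
import pandas as pd

# Illustrative Unix timestamps in whole seconds (assumed unit of the C# writer).
unix_seconds = np.array([0, 847497600], dtype=np.int64)

# If the raw integers are taken as nanoseconds, every value lands
# within the first seconds of 1970.
as_if_nanoseconds = pd.to_datetime(unix_seconds, unit="ns")

# Declaring the unit explicitly gives the intended dates.
as_seconds = pd.to_datetime(unix_seconds, unit="s")

# Alternative: scale to nanoseconds first, then reinterpret the int64 buffer.
scaled = (unix_seconds * 1_000_000_000).view("datetime64[ns]")

print(as_if_nanoseconds)
print(as_seconds)
print(scaled)
```

If the writer cannot be changed, converting with `unit="s"` after loading is one option; multiplying by 1,000,000,000 on the C# side before writing is another.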

Here is my dataset creation code, along with some screenshots of what I
have investigated:

public void CriaHDF5Customizado(PackingConfigFile pf)
{

       // PopulaOPC(pf);

        H5FileId arquivoHDF5 =
H5F.create("C:/Users/arthuro/Desktop/timestampteste.h5",
H5F.CreateMode.ACC_TRUNC);
        H5GroupId datasetGroup = H5G.create(arquivoHDF5, "Datasets");
        H5GroupId infosGroup = H5G.create(arquivoHDF5, "Informations");
        H5G.close(infosGroup);

        opcSt opHelper = new opcSt();
        opHelper.dt = (Int64)DateTime.Parse("1996-11-9").Subtract(new
DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc)).TotalSeconds;
        opHelper.qlt = (Int64)DateTime.Now.Subtract(new DateTime(1970,
1, 1, 0, 0, 0, DateTimeKind.Utc)).TotalSeconds;
        opHelper.vl = 123456.7f;


        int structsize = Marshal.SizeOf(opHelper);
        H5DataTypeId myOPCType = H5T.create(H5T.CreateClass.COMPOUND,
structsize);
        //H5T.insert(myOPCType, "TimeStamp", 0, new H5DataTypeId(H5T.H5Type.NATIVE_LONG));
        H5T.insert(myOPCType, "Quality", 8, new
H5DataTypeId(H5T.H5Type.NATIVE_LONG));
        H5T.insert(myOPCType, "Value", 16, new
H5DataTypeId(H5T.H5Type.NATIVE_FLOAT));


        long[] ds = new long[1];
        ds[0] = 1;
        H5DataTypeId dtDatetime = H5T.create_array(new
H5DataTypeId(H5T.H5Type.NATIVE_LLONG), ds);
        H5T.insert(myOPCType, "TimeStamp", 0, dtDatetime);

        foreach (BasicVariable bv in pf.basicVariableList.bvList)
        {
            bv.PopulateOPCUA();
            var bvnow = bv;
            var bvNro = pf.basicVariableList.bvList.IndexOf(bv);
            long[] dims = new long[1];
            dims[0] = bv.bvData.Count;
            H5DataSpaceId myDataSpace = H5S.create_simple(1, dims);
            H5DataSetId bvDset = H5D.create(datasetGroup, bv.bvTag,
myOPCType, myDataSpace);
            var arrayaux = new List<opcSt>();
            foreach (OPC_UA opc in bv.bvData)
            {
                var aux = new opcSt(opc.timeStamp, opc.quality,
(float)opc.data);
                arrayaux.Add(aux);
            }
            H5D.write(bvDset, myOPCType, new
H5Array<opcSt>(arrayaux.ToArray()));
            string[] stringteste = { "datetime64" };
            H5DataTypeId attrDt = H5T.copy(H5T.H5Type.C_S1);
            H5T.setSize(attrDt, stringteste[0].Length);
            var longsz = new long[] { 1 };
            var enc = new System.Text.ASCIIEncoding();
            var array1 = enc.GetBytes(stringteste[0]);
            var charArray = new byte[stringteste[0].Length + 1];
            array1.CopyTo(charArray, 0);
            charArray[stringteste[0].Length] = 0;
            H5DataSpaceId atribDS = H5S.create_simple(1, longsz);

            H5AttributeId Attribs = H5A.create(bvDset,
"TimeStamp_dtype", attrDt, atribDS);
            H5A.write(Attribs, attrDt, new H5Array<byte>(charArray));

            stringteste[0] = "(lp0L0La.";

            H5T.setSize(attrDt, stringteste[0].Length);
            longsz = new long[] { 1 };
            enc = new System.Text.ASCIIEncoding();
            array1 = enc.GetBytes(stringteste[0]);
            charArray = new byte[stringteste[0].Length + 1];
            array1.CopyTo(charArray, 0);
            charArray[stringteste[0].Length] = 0;
            atribDS = H5S.create_simple(1, longsz);

            H5AttributeId Attribs2 = H5A.create(bvDset,
"TimeStamp_kind", attrDt, atribDS);
            H5A.write(Attribs2, attrDt, new H5Array<byte>(charArray));

            stringteste[0] = "N.";

            H5T.setSize(attrDt, stringteste[0].Length);
            longsz = new long[] { 1 };
            enc = new System.Text.ASCIIEncoding();
            array1 = enc.GetBytes(stringteste[0]);
            charArray = new byte[stringteste[0].Length + 1];
            array1.CopyTo(charArray, 0);
            charArray[stringteste[0].Length] = 0;
            atribDS = H5S.create_simple(1, longsz);

            H5AttributeId Attribs3 = H5A.create(bvDset,
"TimeStamp_meta", attrDt, atribDS);
            H5A.write(Attribs3, attrDt, new H5Array<byte>(charArray));


            H5S.close(atribDS);
            H5A.close(Attribs);
            H5A.close(Attribs2);
            H5A.close(Attribs3);
            H5D.close(bvDset);
            H5S.close(myDataSpace);
            bv.bvData = new List<OPC_UA>();
        }

        H5G.close(datasetGroup);
        H5F.close(arquivoHDF5);
    }
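On the reading side, one possible approach is to load the compound records as plain integers and convert the timestamp field explicitly, rather than relying on attributes. The sketch below uses an in-memory NumPy structured array as a stand-in for one dataset. The field names and offsets mirror the C# code above, but the 24-byte record size, the sample values, and the helper name `records_to_frame` are assumptions made for illustration. It also assumes "TimeStamp" arrives as a plain int64 in seconds, not as the one-element array type created above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one compound dataset written by the C# code.
# Offsets 0, 8, 16 follow the H5T.insert calls; itemsize 24 is an assumption.
opc_dtype = np.dtype({
    "names": ["TimeStamp", "Quality", "Value"],
    "formats": ["<i8", "<i8", "<f4"],
    "offsets": [0, 8, 16],
    "itemsize": 24,
})

# Made-up sample records; TimeStamp is in whole seconds since the Unix epoch.
records = np.zeros(2, dtype=opc_dtype)
records["TimeStamp"] = [847497600, 847497660]
records["Quality"] = [192, 192]
records["Value"] = [123456.7, 123457.1]


def records_to_frame(rec):
    """Build a DataFrame from compound records and convert TimeStamp to datetime."""
    frame = pd.DataFrame({name: rec[name] for name in rec.dtype.names})
    # The stored integers are seconds, so the unit must be stated explicitly.
    frame["TimeStamp"] = pd.to_datetime(frame["TimeStamp"], unit="s")
    return frame


frame = records_to_frame(records)
print(frame.dtypes)
print(frame)
```

With a real file, the structured array would come from whatever HDF5 reader is used. The conversion step stays the same as long as the stored unit is known.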


What pandas reads (timestamp and datetime), where:

1 - the value shown in the Jupyter notebook (processed by pandas)
2 - the raw value stored in the HDF5 file

https://i.stack.imgur.com/ZZJ3A.png

And these are the attributes of the table created by pandas (which I have
tried to recreate in my code):

https://i.stack.imgur.com/U1Zom.png