Dr Jarosław Duda (Jarek Duda)

 

Assistant professor at Institute of Computer Science (adiunkt),

Jagiellonian University

email:  jaroslaw.duda[at]uj.edu.pl

arXiv, GitHub, lectures, LinkedIn, ORCID, RGate, Scholar, stack, USOS, Wikipedia,
Wolfram – introductions to some of my topics with demonstrations and basic code

 

Short CV:

2015-           Jagiellonian University, Institute of Computer Science, assistant professor,

2013-2014   Purdue University, NSF Center for Science of Information, Postdoctoral researcher (webpage),

2006-2012   Jagiellonian University, Cracow, PhD in Theoretical Physics (thesis)

2004-2010   Jagiellonian University, Cracow, PhD in Theoretical Computer Science (thesis)

2001-2006   Jagiellonian University, Cracow, MSc in Theoretical Physics (thesis)

2000-2005   Jagiellonian University, Cracow, MSc in Theoretical Mathematics (thesis)

1999-2004   Jagiellonian University, Cracow, MSc in Computer Science (thesis)

 

Main research areas:

Information theory/statistical physics - for my last MSc ([1] is its translation) I worked on optimal encoding with constraints on a lattice (a multidimensional generalization of Fibonacci coding), for example to improve storage capacity through more precise head positioning. The capacity-maximizing way of choosing the statistical model (Maximal Entropy Random Walk – Wikipedia, 2018 article) was further developed for applications in physics as my second PhD. This 2006 MSc thesis also started ANS coding and has led me to a few new coding approaches (slides):

-        Asymmetric Numeral Systems (ANS, Wikipedia, materials, JU promotional animation, poster, introduction): a family of entropy coders (the heart of data compressors). Previously a compromise was needed: Huffman coding allowed fast but suboptimal compression, arithmetic coding nearly optimal but slow (costly) compression. ANS offers the compression ratio of arithmetic coding at a speed/cost similar to Huffman coding. For example, Facebook ZSTD (also used e.g. in the Linux kernel and the Android operating system, and standardized for email/html) and Apple LZFSE (default in macOS and iOS) use the Finite State Entropy implementation of the tANS variant, while the CRAM DNA compressor of the European Bioinformatics Institute, Google Draco, JPEG XL next-generation image compression and Dropbox DivANS use the rANS variant. Additionally, the chaotic behavior of tANS makes it well suited for simultaneous encryption,

-        Constrained Coding: a generalization of the Kuznetsov-Tsybakov problem, which allows encoding a message under constraints known only to the sender. This generalization allows statistical constraints, for example enforcing resemblance to a given picture (the grayness of a pixel becomes the probability of using 1 at its position). Natural applications are various watermarking/steganography purposes, for example generating QR-like codes resembling a chosen image (implementation, ICIP paper, IEEE Forensics & Security paper),

-        Joint Reconstruction Codes (JRC, implementation): an enhancement of the Fountain Codes concept, which allows reconstructing a message from any large enough subset of packets. JRC additionally does not need the sender to know the final individual damage levels of packets. In the standard approach this knowledge is required to choose redundancy levels, but it is often inaccurate or unavailable in real-life scenarios. For example, while writing to a storage medium we usually don’t know how badly it will be damaged when read. JRC allows the receivers to adapt to the actual noise levels, treated as independent trust levels for each packet during their joint reconstruction/error correction. The introduced continuous family of rates based on Rényi entropy allows estimating the statistical behavior of decoding (Pareto coefficient),

-        Correction Trees philosophy as an improvement of sequential decoding for convolutional codes: using a larger state and bidirectional decoding makes it a complementary alternative to the state-of-the-art methods (implementation). It also allows handling synchronization errors, such as those of the deletion channel.
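
To illustrate the ANS idea from the list above, here is a minimal sketch of the range variant (rANS) coding step, assuming integer symbol frequencies and using an arbitrary-precision Python integer as the state. It is a toy illustration only: practical implementations (e.g. in ZSTD or JPEG XL) keep a fixed-width state with renormalization or use precomputed tANS tables. The function names `build_tables`, `rans_encode` and `rans_decode` are hypothetical.

```python
def build_tables(freqs):
    """Cumulative frequency of each symbol and the total (hypothetical helper)."""
    cum, total = {}, 0
    for s in sorted(freqs):
        cum[s] = total
        total += freqs[s]
    return cum, total


def rans_encode(message, freqs):
    """Encode a sequence of symbols into a single natural number (the state)."""
    cum, total = build_tables(freqs)
    x = 1  # initial state, an arbitrary choice for this sketch
    # symbols are pushed in reverse order, so the decoder pops them forward
    for s in reversed(message):
        f = freqs[s]
        x = (x // f) * total + cum[s] + (x % f)
    return x


def rans_decode(x, length, freqs):
    """Decode `length` symbols from state x; the inverse of rans_encode."""
    cum, total = build_tables(freqs)
    out = []
    for _ in range(length):
        slot = x % total
        # the slot falls into the range [cum[s], cum[s] + freqs[s]) of exactly one symbol
        s = next(t for t in freqs if cum[t] <= slot < cum[t] + freqs[t])
        out.append(s)
        x = freqs[s] * (x // total) + slot - cum[s]
    return out


freqs = {"a": 3, "b": 1}          # assumed model: Pr(a) = 3/4, Pr(b) = 1/4
message = "aabaaaba"
state = rans_encode(message, freqs)
decoded = "".join(rans_decode(state, len(message), freqs))
```

Each encoding step multiplies the state by approximately 1/Pr(s), so the number of bits of the final state grows by about log2(1/Pr(s)) per symbol, which is why the compression ratio approaches that of arithmetic coding.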

 

Machine learning – searching for mathematically more sophisticated, but still practical methods. For example molecular shape descriptors (slides) for virtual screening – parametrization of shape by fitting the general bending of the molecule, then modelling its cross-section as an evolving ellipse.

-        Hierarchical Correlation Reconstruction (HCR, slides, talk, introduction): a family of methods for predicting (multivariate) probability densities by splitting dependencies into mixed moments and modelling their time evolution. It is suited e.g. for systematic enhancement of ARMA/ARCH-like models: with proper tail handling, approaching any real joint distribution, and modelling its time evolution for non-stationary time series. Example applications include predicting probability distributions of values in (also multivariate) time series, credibility evaluation by modelling conditional distributions, nonstationarity analysis, multi-feature (auto)correlation analysis, and many others.

-        SGD Online Gradient Regression (OGR, slides, github, talk, introduction): an optimizer family e.g. for neural network training, which is currently dominated by 1st order methods with heuristic modifications, like ADAM updating 2 averages. OGR approaches instead update e.g. 4 averages, this way providing a real 2nd order method: ideally optimizing parabolas/paraboloids in a single step, which can provide much faster convergence.
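
The "4 averages" idea of OGR can be sketched in one dimension as online linear regression of gradients: averages of θ, g, θ² and θ·g give the slope of the gradient (a Hessian estimate) and the position where the fitted gradient vanishes. The block below is a simplified sketch under these assumptions, not the implementation from the linked repository; the function name `ogr_minimize_1d`, the parameters `beta` and `lr`, and the fallback to a plain gradient step are hypothetical choices for illustration.

```python
def ogr_minimize_1d(grad, theta, steps=20, beta=0.9, lr=0.1):
    """Minimize a 1D function given its gradient, with an OGR-like update (sketch)."""
    # exponentially weighted sums; w is the sum of weights used for normalization
    w = s_t = s_g = s_tt = s_tg = 0.0
    for _ in range(steps):
        g = grad(theta)
        w = beta * w + 1.0
        s_t = beta * s_t + theta
        s_g = beta * s_g + g
        s_tt = beta * s_tt + theta * theta
        s_tg = beta * s_tg + theta * g
        mean_t, mean_g = s_t / w, s_g / w
        var = s_tt / w - mean_t * mean_t    # weighted variance of theta
        cov = s_tg / w - mean_t * mean_g    # weighted covariance of theta and g
        if var > 1e-12 and cov > 0.0:
            hessian = cov / var             # slope of the linear model g ~ hessian * (theta - p)
            theta = mean_t - mean_g / hessian   # p: where the modelled gradient is zero
        else:
            theta = theta - lr * g          # fallback: plain gradient descent step
    return theta


# parabola f(t) = 1.5 * (t - 2)^2 with gradient 3 * (t - 2)
result = ogr_minimize_1d(lambda t: 3.0 * (t - 2.0), 0.0, steps=5)
```

For an exact parabola the gradient is linear in θ, so two distinct evaluations already determine the minimum; for general functions the regression only follows the local linear trend of the gradients, which is why older values are down-weighted by `beta`.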

 

Maximal Entropy Random Walk (Wikipedia, last PhD, 2018 article, slides, talk, introduction, application to 2D Ising model, electron diffusion p-n junction (diode) model, introduction): standard stochastic models are based on the philosophy that an object performs successive random decisions using probabilities chosen arbitrarily by us. In contrast, in statistical physics this randomness only represents our lack of knowledge. Such models should be based on the maximal entropy principle (Jaynes), or equivalently on choosing e.g. the canonical ensemble, which leads to the recent Maximal Entropy Random Walk (MERW) and its extensions. By constructing models that finally fulfill this fundamental mathematical requirement, in contrast to the standard approach (which can be seen as an approximation), we finally get agreement with thermodynamical expectations of quantum mechanics, like thermalization to the quantum mechanical ground state probability density and the Born rule: ‘squares’ relating amplitudes and probabilities. My work on this subject started with my physics MSc thesis ([1] is its translation), where the equations were found for information theory applications. Here is a conductance simulator to compare both philosophies.
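
The difference between the two philosophies can be sketched numerically for a small undirected graph. For a symmetric adjacency matrix A with dominant eigenvalue λ and eigenvector ψ, MERW uses transition probabilities S_ij = A_ij ψ_j / (λ ψ_i) with stationary density ρ_i = ψ_i², while the standard (generic) random walk chooses each outgoing edge with equal probability. The block below is a minimal pure-Python sketch assuming a connected graph; the function names `merw` and `grw` are hypothetical, and the eigenvector is found by simple power iteration.

```python
def merw(adj, iters=500):
    """Transition matrix and stationary density of MERW on a connected undirected graph."""
    n = len(adj)
    psi = [1.0] * n
    for _ in range(iters):
        # power iteration on A + I: the shift avoids oscillation on bipartite graphs
        new = [psi[i] + sum(adj[i][j] * psi[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in new) ** 0.5
        psi = [v / norm for v in new]
    a_psi = [sum(adj[i][j] * psi[j] for j in range(n)) for i in range(n)]
    lam = sum(psi[i] * a_psi[i] for i in range(n))  # Rayleigh quotient (psi is normalized)
    S = [[adj[i][j] * psi[j] / (lam * psi[i]) for j in range(n)] for i in range(n)]
    rho = [v * v for v in psi]                      # 'squares' of the amplitudes
    return S, rho


def grw(adj):
    """Generic random walk: uniform choice among outgoing edges."""
    deg = [sum(row) for row in adj]
    S = [[a / d for a in row] for row, d in zip(adj, deg)]
    rho = [d / sum(deg) for d in deg]               # stationary density proportional to degree
    return S, rho


# path graph with 4 vertices: 0 - 1 - 2 - 3
path4 = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
S_merw, rho_merw = merw(path4)
S_grw, rho_grw = grw(path4)
```

On this path the MERW density is more concentrated in the middle than the degree-based density of the generic walk, resembling the ground state of a particle in a box.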

 

Topological soliton particle models (framework, slides, github, talk, introduction): Skyrme popularized the search for an alternative approach to particle models: starting not, as usual, with the QFT perturbative approximation, which leads to many mathematical problems, but with trying to understand the configuration of fields building the particle (e.g. electromagnetic), which generally should maintain its structure (be a soliton), for example due to topological constraints for spin and charge. The standard skyrmion approach introduces separate fields to model single mesons or baryons. The perfect situation would be having just a single field whose family of topological excitations agrees with the particle menagerie and their dynamics, with topological charges as quantum numbers. A simple modification of the liquid crystal Landau-de Gennes model provides surprisingly good agreement with particle physics: starting with the Coulomb interaction, 3 leptons, and baryons with the proton lighter than the neutron.

 

Complex Base Numeral Systems (first two MSc-s, slides, presentation, introduction): a probably complete family of positional numeral systems with complex base which are ‘proper’: the representation function from digit sequences into the complex plane is surjective, and injective everywhere except on a zero measure set (which is unavoidable, like 0.999(9)=1.000(0)). The fractional part turns out to be a simple Iterated Function System (fractal). I have also introduced practical methods for arithmetic in this representation, an analytical tool to work with the convex hull of such simple fractals, a way to get analytical formulas for the Hausdorff dimension of the boundary of such sets and, briefly, a generalization to higher dimensions. It is described in [2] and [3].
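
As a concrete illustration of a positional system with complex base, the sketch below uses the classical base -1+i with digits {0, 1}, in which every Gaussian integer has a finite representation. This well-known example is chosen here only for illustration and is not necessarily the construction used in [2] and [3]; the function names are hypothetical.

```python
def to_base_minus1_plus_i(a, b):
    """Digits (most significant first) of the Gaussian integer a + b*i in base -1+i."""
    if a == 0 and b == 0:
        return "0"
    digits = []
    while a != 0 or b != 0:
        d = (a + b) % 2                     # digit making (a + b*i - d) divisible by -1+i
        x, y = a - d, b
        a, b = (y - x) // 2, (-x - y) // 2  # exact division of x + y*i by -1+i
        digits.append(str(d))
    return "".join(reversed(digits))


def from_base_minus1_plus_i(digits):
    """Inverse conversion: returns (a, b) representing a + b*i."""
    a, b = 0, 0
    for ch in digits:
        # Horner step: multiply by -1+i, then add the next digit
        a, b = -a - b, a - b
        a += int(ch)
    return a, b


two = to_base_minus1_plus_i(2, 0)           # "1100": (-1+i)^3 + (-1+i)^2 = (2+2i) + (-2i)
minus_one = to_base_minus1_plus_i(-1, 0)    # "11101"
```

Note that no sign or separate imaginary part is needed: a single digit string represents both coordinates. Allowing digits after the point, the set of representable fractional parts forms a fractal (here the twindragon).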

 

Other interests and hobbies:

-        P vs NP, graph isomorphism problem (also for quantum computing), Markov fields, DNA reconstruction.

-        Biology, e.g. evolutionism, neurobiology, biochemistry. For example the chiral life concept (Wikipedia): as a computer scientist starting to study genetics, I thought about modifying the rules by which triples of nucleotides are translated into amino acids, to get immunity through incompatibility with our viruses. This approach has a lot of issues, but in 2007 it led me to the possibility of synthesizing a mirror version of standard cells (original forum post). It turns out that the race has recently started, e.g. in 2016 reaching the synthesis of mirror polymerase (enantiomer). While mirror life carries enormous new possibilities, including pathogen-immune humans, the dangers of such synthetic life may include eradication of our life: mirror photosynthesizing cyanobacteria could dominate our ecosystem. Hence, I believe a wide discussion about the ongoing race to this synthesis is now required.

-        Others: dancing, climbing, biking, fencing, photography

 

Articles:

[1] J. Duda, Optimal encoding on discrete lattice with translational invariant constrains using statistical algorithms, arXiv:0710.3861 (2007)

[2] J. Duda, Analysis of the convex hull of the attractor of an IFS, arXiv:0710.3863 (2007)

[3] J. Duda, Complex base numeral systems, arXiv:0712.1309 (2007)

[4] J. Duda, Combinatorial invariants for graph isomorphism problem, arXiv:0804.3615 (2008)

[5] Z. Burda, J. Duda, J. M. Luck, B. Wacław, Localization of the Maximal Entropy Random Walk, Phys. Rev. Lett. 102, 160602 (2009)

[6] J. Duda, Asymmetric numeral systems, arXiv:0902.0271 (2009)

[7] J. Duda, Four-dimensional understanding of quantum mechanics, arXiv:0910.2724 (2009)

[8] Z. Burda, J. Duda, J. M. Luck, B. Wacław, The various facets of random walk entropy, Acta Phys. Polon. B. 41/5 (2010)

[9] J. Duda, From Maximal Entropy Random Walk to quantum thermodynamics, arXiv:1111.2253 (2011) (slides)

[10] J. Duda, P. Korus, Correction Trees as an Alternative to Turbo Codes and Low Density Parity Check Codes, arXiv:1204.5317 (2012)

[11] J. Duda, Optimal compression of hash-origin prefix trees, arXiv:1206.4555 (2012) (slides)

[12] J. Duda, Embedding grayscale halftone pictures in QR Codes using Correction Trees, arXiv:1211.1572 (2012) (slides)

[13] J. Duda, From Maximal Entropy Random Walk to quantum thermodynamics, J. Phys.: Conf. Ser. 361 012039 (2012)

[14] J. Duda, Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding, arXiv:1311.2540 (2013) (slides)

[15] Y. Baryshnikov, J. Duda, W. Szpankowski, Markov Fields Types and Tilings, ISIT 2014 (2014)

[16] J. Duda, N. Gadgil, K. Tahboub, E. J. Delp, Generalizations of the Kuznetsov-Tsybakov problem for generating image-like 2D barcodes, ICIP 2014 (2014)

[17] J. Duda, Joint error correction enhancement of the Fountain Codes concept, arXiv:1505.07056 (2015)

[18] J. Duda, Normalized rotation shape descriptors and lossy compression of molecular shape, arXiv:1505.09211 (2015) (slides)

[19] J. Duda, N. Gadgil, K. Tahboub, E. J. Delp, The use of Asymmetric Numeral Systems as an accurate replacement for Huffman coding, PCS 2015 (PDF)

[20] J. Duda, G. Korcyl, Designing dedicated data compression for physics experiments within FPGA already used for data acquisition, arXiv:1511.00856 (2015)

[21] J. Duda, P. Korus, N. J. Gadgil, K. Tahboub, E. J. Delp, Image-Like 2D Barcodes Using Generalizations Of The Kuznetsov-Tsybakov Problem, IEEE Transactions on Information Forensics & Security volume 11, issue 4 (2016)

[22] J. Duda, W. Szpankowski, A. Grama, Fundamental Bounds and Approaches to Sequence Reconstruction from Nanopore Sequencers, arXiv:1601.02420 (2016)

[23] J. Duda, Distortion-Resistant Hashing for rapid search of similar DNA subsequence, arXiv:1602.05889 (2016)

[24] Y. Baryshnikov, J. Duda, W. Szpankowski, Types of Markov Fields and Tilings, IEEE Transactions on Information Theory volume 62, issue 8 (PDF) (2016)

[25] J. Duda, Nonuniform probability modulation for reducing energy consumption of remote sensors, arXiv:1608.04271 (2016)

[26] J. Duda, Practical estimation of rotation distance and induced partial order for binary trees, arXiv:1610.06023 (2016)

[27] A. Magner, J. Duda, W. Szpankowski, A. Grama, Fundamental Bounds for Sequence Reconstruction from Nanopore Sequencers, IEEE Transactions on Molecular, Biological, and Multi-Scale Communications (2016)

[28] J. Duda, M. Niemiec, Lightweight compression with encryption based on Asymmetric Numeral Systems, arXiv:1612.04662 (2016)

[29] J. Duda, Rapid parametric density estimation, arXiv:1702.02144 (2017) (slides)

[30] J. Duda, P?=NP as minimization of degree 4 polynomial, integration or Grassmann number problem, and new graph isomorphism problem approaches, arXiv:1703.04456 (2017) (slides)

[31] J. Duda, Improving Pyramid Vector Quantizer with power projection, arXiv:1705.05285 (2017)

[32] J. Duda, Four-dimensional understanding of quantum mechanics and computation, arXiv:0910.2724v2 (2017)

[33] J. Duda, Polynomial-based rotation invariant features, arXiv:1801.01058 (2018)

[34] J. Duda, Hierarchical correlation reconstruction with missing data, for example for biology-inspired neuron, arXiv:1804.06218 (2018) (slides)

[35] J. Duda, Exploiting statistical dependencies of time series with hierarchical correlation reconstruction, arXiv:1807.04119 (2018)

[36] J. Duda, M. Snarska, Modeling joint probability distribution of yield curve parameters, arXiv:1807.11743 (2018)

[37] J. Duda, Gaussian Auto-Encoder, arXiv:1811.04751 (2018) (slides)

[38] J. Duda, A. Szulc, Credibility evaluation of income data with hierarchical correlation reconstruction, arXiv:1812.08040 (2018) (slides)

[39] J. Duda, Improving SGD convergence by online linear regression of gradients in multiple statistically relevant directions, arXiv:1901.11457 (2019) (slides, github)

[40] M. Mikulski, J. Duda, Toroidal AutoEncoder, arXiv:1903.12286 (2019)

[41] J. Duda, Parametric context adaptive Laplace distribution for multimedia compression, arXiv:1906.03238 (2019) (slides)

[42] J. Duda, SGD momentum optimizer with step estimation by online parabola model, arXiv:1907.07063 (2019) (slides)

[43] J. Duda, R. Syrek, H. Gurgul, Modelling bid-ask spread conditional distributions using hierarchical correlation reconstruction, arXiv:1911.02361 (2019), Statistics in Transition vol 21 no 4 (2020) (slides)

[44] J. Duda, Nearly accurate solutions for Ising-like models using Maximal Entropy Random Walk, arXiv:1912.13300 (2019) (slides, talk)

[45] J. Duda, Adaptive exponential power distribution with moving estimator for nonstationary time series, arXiv:2003.02149 (2020)

[46] J. Duda, A. Szulc, Social Benefits Versus Monetary and Multidimensional Poverty in Poland: Imputed Income Exercise, ICOAE 2019 (2020)

[47] J. Duda, Exploiting context dependence for image compression with upsampling, arXiv:2004.03391 (2020)

[48] J. Duda, G. Bhatta, Log-stable probability density functions, non-stationarity evaluation, and multi-feature autocorrelation analysis of the γ-ray light curves of blazars, arXiv:2005.14040, Monthly Notices of the Royal Astronomical Society Main Journal (2021)

[49] J. Duda, Improving distribution and flexible quantization for DCT coefficients, arXiv:2007.12055 (2020) (slides)

[50] S. Camtepe, J. Duda, A. Mahboubi, P. Morawiecki, S. Nepal, M. Pawłowski, J. Pieprzyk, Compcrypt – Lightweight ANS-based Compression and Encryption, https://eprint.iacr.org/2021/010, IEEE Transactions on Information Forensics & Security (2021)

[51] J. Duda, Encoding of probability distributions for Asymmetric Numeral Systems, arXiv:2106.06438 (2021)

[52] J. Duda, H. Gurgul, R. Syrek, Multi-feature evaluation of financial contagion, Central European Journal of Operations Research (2021)

[53] S. Camtepe, J. Duda, A. Mahboubi, P. Morawiecki, S. Nepal, M. Pawłowski, J. Pieprzyk, ANS-based Compression and Encryption with 128-bit Security, https://eprint.iacr.org/2021/900 (2021), International Journal of Information Security (2022)

[54] J. Duda, Framework for liquid crystal based particle models, arXiv:2108.07896 (2021) (slides, github)

[55] J. Duda, Diffusion models for atomic scale electron currents in semiconductor, p-n junction, arXiv:2112.12557 (2021) (slides, talk)

[56] A. Mahboubi, K. Ansari, S. Camtepe, J. Duda, P. Morawiecki, M. Pawłowski, J. Pieprzyk, Digital Immunity Module: Preventing Unwanted Encryption using Source Coding, techrxiv (2022)

[57] J. Pieprzyk, M. Pawlowski, P. Morawiecki, A. Mahboubi, J. Duda, S. Camtepe, Pseudorandom Bit Generation with Asymmetric Numeral Systems, https://eprint.iacr.org/2022/005 (2022)

[58] J. Duda, Context binning, model clustering and adaptivity for data compression of genetic data, arXiv:2201.05028 (2022) (slides)

[59] J. Duda, Fast optimization of common basis for matrix set through Common Singular Value Decomposition, arXiv:2204.08242 (2022)

[60] J. Duda, Predicting conditional probability distributions of redshifts of Active Galactic Nuclei using Hierarchical Correlation Reconstruction, arXiv:2206.06194 (2022)

[61] J. Duda, S. Podlewska, Low cost prediction of probability distributions of molecular properties for early virtual screening, arXiv:2207.11174 (2022)

[62] J. Pieprzyk, J. Duda, M. Pawlowski, S. Camtepe, A. Mahboubi, P. Morawiecki, Compression Optimality of Asymmetric Numeral Systems, arXiv:2209.02228, Entropy (2023)

[63] J. Duda, Predicting probability distributions for cancer therapy drug selection optimization, arXiv:2209.06211 (2022)

[64] J. Duda, S. Podlewska, Prediction of probability distributions of molecular properties: towards more efficient virtual screening and better understanding of compound representations, Molecular Diversity (2022)

[65] J. Duda, M. Niemiec, Lightweight compression with encryption based on asymmetric numeral systems, AMCS vol 33 (2023)

[66] J. Duda, Adaptive Student’s t-distribution with method of moments moving estimator for nonstationary time series, arXiv:2304.03069 (2023) (slides, talk)

[67] J. Duda, Time delay multi-feature correlation analysis to extract subtle dependencies from EEG signals, arXiv:2305.09478 (2023)

[68] J. Duda, Two-way quantum computers adding CPT analog of state preparation, arXiv:2308.13522 (2023) (slides, talk)

[69] J. Duda, Extracting individual variable information for their decoupling, direct mutual information and multi-feature Granger causality, arXiv:2311.13431 (2023)

[70] J. Duda, Phase space maximal entropy random walk: Langevin-like ensemble of physical trajectories, arXiv:2401.01239 (2024)

[71] J. Duda, Simple inexpensive vertex and edge invariants distinguishing dataset strongly regular graphs, arXiv:2402.04916 (2024)

[72] J. Duda, J. Leśkow, P. Pawlik, W. Cioch, CMAFI — Copula-based Multifeature Autocorrelation Fault Identification of rolling bearing, Mechanical Systems and Signal Processing (2024)

 

My interactive demonstrations presenting some of my work in an intuitive way: https://community.wolfram.com/web/dudaj

Some of my implementations: https://github.com/JarekDuda, video lectures: https://www.youtube.com/channel/UCbajruVGXJ7lsJKHPcqE7Cw

Quantum foundations seminar (old), now QM Foundations & Nature of Time seminar