ayan@website $ me.pubs(journs, confs) _

As an academic, I publish my research in top-ranked conferences and journals (CVPR, ECCV, SIGGRAPH, Elsevier PR) in Computer Vision and Deep Learning. The list below is a selection of my publications, which I try to keep up to date. In case I have missed anything, you can find the exhaustive list on my Google Scholar profile. Click on the title or the READ button to access further details, including the BibTeX entry. I also serve as a reviewer for many of these venues (e.g. ICCV, SIGGRAPH).

[1] Cloud2Curve: Generation and Vectorization of Parametric Sketches

Author(s): Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song
Venue: Computer Vision and Pattern Recognition (CVPR), 2021
Dated: 01 Mar 2021

Abstract: Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations. We further aim to model sketches as a sequence of low-dimensional parametric curves. To this end, we propose an inverse graphics framework capable of approximating a raster or waypoint based stroke encoded as a point-cloud with a variable-degree Bézier curve. Building on this module, we present Cloud2Curve, a generative model for scalable high-resolution vector sketches that can be trained end-to-end using point-cloud data alone. As a consequence, our model is also capable of deterministic vectorization which can map novel raster or waypoint based sketches to their corresponding high-resolution scalable Bézier equivalent. We evaluate the generation and vectorization capabilities of our model...

[2] Pixelor: A Competitive Sketching AI Agent. So you think you can sketch?

Author(s): Ayan Kumar Bhunia*, Ayan Das*, Umar Riaz Muhammad*, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song (* Equal Contribution)
Venue: SIGGRAPH Asia, 2020
Link: http://sketchx.ai/pixelor
Dated: 30 Jul 2020

Abstract: We present the first competitive drawing agent Pixelor that exhibits human-level performance at a Pictionary-like sketching game, where the participant whose sketch is recognized first is a winner. Our AI agent can autonomously sketch a given visual concept, and achieve a recognizable rendition as quickly or faster than a human competitor. The key to victory for the agent is to learn the optimal stroke sequencing strategies that generate the most recognizable and distinguishable strokes first. Training Pixelor is done in two steps. First, we infer the optimal stroke order that maximizes early recognizability of human training sketches. Second, this order is used to supervise the training of a sequence-to-sequence stroke generator. Our key technical contributions are a tractable search of...

[3] BézierSketch: A generative model for scalable vector sketches

Author(s): Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, Yi-Zhe Song
Venue: European Conference on Computer Vision (ECCV), 2020
Dated: 22 May 2020

Abstract: The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints. However, this leads to low-resolution image generation, and failure to model long sketches. In this paper we present BézierSketch, a novel generative model for fully vector sketches that are automatically scalable and high-resolution. To this end, we first introduce a novel inverse graphics approach to stroke embedding that trains an encoder to embed each stroke to its best fit Bézier curve. This enables us to treat sketches as short sequences of parameterized strokes and thus train a recurrent...

[4] Keyword spotting in doctors handwriting on medical prescriptions

Author(s): Partha Pratim Roy, Ayan Kumar Bhunia, Ayan Das, Prithviraj Dhar, Umapada Pal
Venue: Expert Systems with Applications (ESWA)
Dated: 15 Jun 2017

Abstract: In this paper, we propose a word spotting based information retrieval approach for medical prescriptions/reports written by doctors. Sometimes, due to almost illegible handwriting, it is difficult to understand the medication reports of doctors. This often confuses patients about the actual medicine/disease names written by doctors, and as a consequence they suffer. A medical prescription is generally partitioned into two parts: a printed letterhead part containing the doctor's name, designation, organization name, etc., and a handwritten part where the doctor writes the patient's name, reports his/her findings, and suggests medicine names. The proposed work has many significant impacts. For example, such work can be used (i) to develop expert diagnostic systems (ii) to extract information from patient...

[5] Feature weighting and selection with a Pareto-optimal trade-off between relevancy and redundancy

Author(s): Ayan Das, Swagatam Das
Venue: Pattern Recognition Letters (PRL)
Dated: 01 Mar 2017

Abstract: Feature Selection (FS) is an important pre-processing step in machine learning and it reduces the number of features/variables used to describe each member of a dataset. Such reduction occurs by eliminating some of the non-discriminating and redundant features and selecting a subset of the existing features with higher discriminating power among various classes in the data. In this paper, we formulate the feature selection as a bi-objective optimization problem of some real-valued weights corresponding to each feature. A subset of the weighted features is thus selected as the best subset for subsequent classification of the data. Two information theoretic measures, known as ‘relevancy’ and ‘redundancy’ are chosen for designing the objective functions for a very competitive Multi-Objective Optimization (MOO) algorithm...

[6] HMM-based Indic handwritten word recognition using zone segmentation

Author(s): Partha Pratim Roy, Ayan Kumar Bhunia, Ayan Das, Prasenjit Dey, Umapada Pal
Venue: Pattern Recognition (PR)
Dated: 01 Dec 2016

Abstract: This paper presents a novel approach towards Indic handwritten word recognition using zone-wise information. Because of the complex nature of Indic scripts (e.g. Devanagari, Bangla, Gurumukhi, and other similar scripts), which arises from compound characters, modifiers, overlapping and touching, etc., character segmentation and recognition is a tedious job. To avoid character segmentation in such scripts, HMM-based sequence modeling has been used earlier in a holistic way. This paper proposes an efficient word recognition framework that segments the handwritten word images horizontally into three zones (upper, middle and lower) and recognizes the corresponding zones. The main aim of this zone segmentation approach is to reduce the number of distinct component classes compared to the total number of classes in Indic scripts. As a result,...

[7] A comparative study of features for handwritten Bangla text recognition

Author(s): Ayan Kumar Bhunia, Ayan Das, Partha Pratim Roy, Umapada Pal
Venue: 13th International Conference on Document Analysis and Recognition (ICDAR)
Dated: 23 Aug 2016

Abstract: Recognition of Bangla handwritten text is difficult due to its complex nature of having modifiers and headline features. This paper presents a comparative study of different features, namely LGH (Local Gradient of Histogram), PHOG (Pyramid Histogram of Oriented Gradient), GABOR, G-PHOG (Combined GABOR and PHOG) and the profile feature by Marti-Bunke, when applied to middle zone recognition of Bangla words using a Hidden Markov Model (HMM) based framework. For this purpose, a zone segmentation method is applied to extract the busy (middle) zones of handwritten words, and features are extracted from the middle zone. The system has been tested on a sufficiently large and variation-rich dataset consisting of 11,253 training and 3,856 testing data. From the experiment, it has been noted that...

[8] Retrieval of scene image and video frames using date field spotting

Author(s): Partha Pratim Roy, Ayan Das, Dipak Majhi, Umapada Pal
Venue: 3rd IAPR Asian Conference on Pattern Recognition (ACPR)
Dated: 03 Nov 2015

Abstract: In this paper, we present a date spotting based information retrieval system for natural scene images and video frames where text appears with complex backgrounds. Text retrieval in such scene/video frames is difficult because of blur, low resolution, background noise, etc. In our proposed framework, a line based date spotting approach using a Hidden Markov Model is used to detect the date information in text. Given a text line image, we apply an efficient Bayesian classifier based binarization approach to extract the text components. Next, a Pyramid Histogram of Oriented Gradient (PHOG) feature is computed from the binarized image for the date-spotting framework. For our experiment, three different date models have been constructed to search similar date information in scene/video text. When tested...

[9] Handwritten word spotting in Indic scripts using foreground and background information

Author(s): Ayan Das, Ayan Kumar Bhunia, Partha Pratim Roy, Umapada Pal
Venue: 3rd IAPR Asian Conference on Pattern Recognition (ACPR)
Dated: 03 Nov 2015

Abstract: In this paper we present a line based word spotting system based on a Hidden Markov Model for offline Indic scripts such as Bangla (Bengali) and Devanagari. We propose a novel approach of combining foreground and background information of text line images for keyword-spotting by character filler models. The candidate keywords are searched from a line without segmenting characters or words. A significant improvement in performance is noted when using both foreground and background information compared with using either one alone. The Pyramid Histogram of Oriented Gradient (PHOG) feature has been used in our word spotting framework and it outperforms other existing features of word spotting. The framework of combining foreground and background information has been evaluated on the IAM dataset (English script) to show the...