Intern - Data and Applied Scientist

Smart Summary (powered by Roshi)

NetApp is seeking a Data & Applied Scientist Intern to join the Data Services organization. The intern will create synthetic data, improve NLP models, co-author papers, and work on computer vision and NLP projects.

About NetApp

NetApp is the intelligent data infrastructure company, turning a world of disruption into opportunity for every customer. No matter the data type, workload or environment, we help our customers identify and realize new business possibilities. And it all starts with our people.

If this sounds like something you want to be part of, NetApp is the place for you. You can help bring new ideas to life, approaching each challenge with fresh eyes. We embrace diversity and openness because it's in our DNA. Of course, you won't be doing it alone. At NetApp, we're all about asking for help when we need it, collaborating with others, and partnering across the organization - and beyond.

"At NetApp, we fully embrace and advance a diverse, inclusive global workforce with a culture of belonging that leverages the backgrounds and perspectives of all employees, customers, partners, and communities to foster a higher performing organization." - George Kurian, CEO

Job Summary

NetApp is seeking a capable Data & Applied Scientist Intern to join the Data Services organization. The overarching vision of this organization is to empower organizations to effectively govern their data estate and build cyber-resiliency while accelerating their digital transformation journey. To achieve this vision, we will take an AI-first approach to build and deliver a world-class suite of data services. As part of this initiative, the Data & Applied Scientist Intern will be responsible for creating essential synthetic data and improving our NLP (Natural Language Processing) models that are critical for NetApp’s strong data governance capabilities.

This internship is a challenging and fun role in one of the most exciting areas of the industry today. It also offers valuable opportunities for the intern to co-author research papers, submit patents, and contribute to open-source projects, enhancing their expertise in Computer Vision and NLP.

Job Requirements

Assist in the development of synthetic data that accurately mirrors real-world data.
Design quality metrics that demonstrate the high quality of the generated data.
Partner with in-house researchers and improve NLP models with synthetic datasets.

Required Qualifications

Master’s in Computer Science, Engineering, Applied Mathematics/Statistics/Data Science, or equivalent skills.
Strong drive for creating value with skills such as Python, modern ML frameworks (PyTorch, transformers) and cloud platforms for experimentation.
Excellent communication and collaboration skills to work closely with in-house researchers and engineers.

Preferred Qualifications

Any relevant industry experience in AI/ML, with a record of research or shipping successful products at scale.
Demonstrated curiosity about, and hands-on tinkering with, LLMs or AI agents.
Solid understanding of deep learning approaches in NLP and computer vision domains.
Active GitHub profile showcasing relevant ML projects or Kaggle achievements.

Location

Bengaluru, India

Company

NetApp

Job Posted

7 months ago

Job Type

Full-time

WorkMode

On-site

Experience Level

0-2 Years

Related Jobs

Data Scientist - Intern

Airbus

Bengaluru, Karnataka, India

Posted: 17 days ago

Job Description:

Airbus Innovation Centre - India & South Asia:

Airbus Innovation Centre - India & South Asia is responsible for industrializing disruptive technologies by tapping into the strong engineering competence centres while also leveraging and co-creating with vibrant external ecosystems such as big tech enterprises, mature startups/MSMEs, national labs and universities, and strategic partners (customers, suppliers, etc.). The technology areas that the Innovation Centre focuses on include Artificial Intelligence, Industrial Automation, Unmanned Air Systems, Connectivity, Space Tech, Autonomy, and Decarbonization Technologies. Airbus Innovation Centre in India is one of three Innovation Centres globally for Airbus, with a strong focus on A.I. and Digital Engineering. We build products from the ground up with the help of stakeholders from within the Engineering and Digital competence centres (in addition to the external stakeholders mentioned above) to deliver operational excellence and contribute to the Innovation & Technology roadmap of the organization.

Title: High-Dimensional Constrained Design of Experiments for ML applications

Introduction:

Surrogate models are used in the area of multidisciplinary analysis and optimization. These surrogate models have the advantage over simulations that they can approximate the effects of parameter variations in real time. This saves time and cost when developing a new aircraft or aircraft variants. In addition, more variations of the parameters can be explored. The optimal design point can be searched for, and the necessary knowledge about the interrelationships of the parameters at the design point is provided. These surrogate models are (in our use cases) Machine Learning (ML) models. Accordingly, a data set must be available for training these surrogate models. The Design of Experiments (DoE) methodology is used to create an optimal data set for this purpose.

The goal is to map ‘m’ simulation inputs to ‘n’ simulation outputs. The larger goal is to create an adaptive DoE which, based on the needs, either finds the optimum dataset or does active learning to increase the performance of the subsequently built surrogate model. Since the simulations are performed sequentially, it is possible to use the already calculated data points to determine the position in the design space where new data points carry the largest amount of information.

However, before investigating further, we want to focus more on the DoE itself. In order to decrease the design space, i.e. the space that optimization algorithms have to search through, a constrained DoE was developed. The aim of this work is to bring this constrained DoE to practical application by adding further input dimensions. A full description of an existing DoE to be translated into a constrained DoE is available. It works today with a cubical “base” DoE whose domain is transformed, in a post-processing step, to comply with the underlying constraints. The problem with this methodology, though, is that the final sample distribution is not homogeneous. This in turn leads to potential bias and unnecessarily large sample sizes for ML applications.

Key Responsibilities:

Today we have two constrained DoEs:
1. zerofuel_mass, zerofuel_cg, fuel_weight
2. altitude, speed, vertical_loadfactor

Learn about the existing DoE libraries (OpenTURNS, JohnDoE) by producing unit tests, docstrings, and documentation.
For the 1st DoE, switch from a temporary Fuel vector implementation to the official implementation.
Enhance the 1st DoE by increasing the complexity through an additional dimension: fuel density.
Enhance the 1st DoE by increasing the complexity through splitting the dimension fuel_weight into re_fuel_weight and de_fuel_weight.
Merge the 1st and 2nd DoE into one DoE and add all missing independent further dimensions to reach practical applicability.

Qualifications:

Strong Python skills.
Statistics background.
Experience with Data Wrangling and Preprocessing.
Experience in Design of Experiments for Data Generation.
Proficiency in version control systems (e.g., Git) and software development best practices.
Machine Learning & Deep Learning Model Development Cycle.

EDUCATION: M.Sc. / M.Eng. in Computer Science, Data Engineering, Mathematics, Aerospace

Data and Applied Scientist-II

Microsoft

Bengaluru, Karnataka, India

Posted: 2 years ago

Join our team at Microsoft Security and help reshape security by protecting customers from digital threats. Develop advanced algorithms and models to detect and neutralize social engineering attacks. Stay updated with the latest research in AI and security, and leverage your machine learning expertise to enhance our products in Microsoft Defender and Microsoft Purview. Handle large datasets, utilize big data tools, and code in languages like Python and SQL. Apply statistical analysis and deep learning techniques to solve real-world problems. Join us in making the world a safer place.

Applied Scientist Intern

Amazon

Bengaluru, Karnataka, India

Posted: 2 years ago

Join as an Applied Scientist Intern at an industry-leading company, working on cutting-edge Machine Learning, NLP, Deep Learning, and Computer Vision algorithms to solve real-world problems. This full-time opportunity requires a commitment to innovation and fearless disruption in a fast-paced environment. Location: Bangalore, Hyderabad, Chennai.

Intern

NetApp

Bengaluru, Karnataka, India

Posted: a month ago

About NetApp

NetApp is the intelligent data infrastructure company, turning a world of disruption into opportunity for every customer. No matter the data type, workload or environment, we help our customers identify and realize new business possibilities. And it all starts with our people. If this sounds like something you want to be part of, NetApp is the place for you. You can help bring new ideas to life, approaching each challenge with fresh eyes. Of course, you won't be doing it alone. At NetApp, we're all about asking for help when we need it, collaborating with others, and partnering across the organization - and beyond.

Job Summary

NetApp is seeking a capable Data & Applied Scientist Intern to join their newly formed Data Services organization. The overarching vision of this organization is to empower organizations to effectively govern their data estate to accelerate their digital transformation journey. To achieve this vision, we will take an AI-first approach to build and deliver a world-class suite of data services. As part of this initiative, the Data & Applied Scientist Intern will collaborate with other data scientists and engineers to develop and implement cutting-edge AI/ML algorithms and frameworks across the key pillars of data protection, privacy and compliance.

Job Requirements

• Assist in the development of supervised/unsupervised machine learning and deep learning models across the areas of general machine learning and natural language processing.
• Assist in the design and development of large and complex proof-of-concepts.
• Assist in the development of end-to-end ML pipelines.
• Any relevant industry experience in AI/ML, with a track record of research or shipping successful products at scale.
• Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams and stakeholders at all levels of the organization.

Education

• Individuals pursuing a Master’s in Computer Science, Engineering, Applied Mathematics/Statistics/Data Science, or any quantitative field with a focus on AI/ML in both supervised and unsupervised machine learning techniques and algorithms.

Data Scientist Intern

Amazon

Hyderabad, Telangana, India

+1 more

Posted: 2 years ago

Seeking motivated data scientists with leadership skills to develop and run analytical models. Background in Computer Science, Engineering, or Operations Research. Use data analysis to improve customer experience and guide business decisions. Opportunity to work at Bangalore, Hyderabad, or Chennai locations.

Intern Data Scientist

Ericsson

Chennai, Tamil Nadu, India

Posted: 5 months ago

Description

Join our Team

About this opportunity:

We are seeking a versatile Data Scientist to join our dynamic team at Ericsson. You will play a pivotal role in harnessing machine learning solutions to solve complex business problems. Predicated on scientific methods and process-driven systems, you will be the driving force behind Ericsson's applied analytics. You will be expected to understand classical and advanced machine learning concepts and apply this knowledge practically to fulfil customer requirements.

What you will do:

Participate in mapping requirements to implementation: analysing, coordinating, prioritizing, and optimizing requirements, and ensuring implementation even with constraints.
Work with data and develop predictive models, recommendation engines, anomaly detection systems, statistical models, deep learning models, and other machine learning systems.
Good understanding of machine learning concepts and programming languages like Python, PySpark, SQL, etc., based on latest market trends.
Ability to work within constraints and timelines and follow the delivery standards and processes defined within Ericsson.
Good understanding and implementation know-how of on-premise as well as on-cloud machine learning solutions.

The skills you bring:

- Business Understanding.
- Artificial Intelligence Systems.
- Software Engineering.
- Data Management.
- Ericsson Business Intelligence and Analytics Competence.
- Open-Source Programming Languages.
- Data Preprocessing.
- Statistics.
- Cloud Development.
- Machine Learning Algorithms.