10 Surprising Things You Can Do with Python’s datetime Module


 

Introduction

 
Python’s built-in datetime module is the go-to library for date and time handling in the Python ecosystem. Most Python coders are familiar with creating datetime objects, formatting them into strings, and performing basic arithmetic with them. However, this powerful module, sometimes alongside related libraries such as calendar, offers far more functionality beyond the basics and can solve complex date- and time-related problems with surprising ease.

This article looks at 10 useful — and perhaps surprising — things you can accomplish with Python’s datetime module. From navigating timezones to calculating specific weekday occurrences, these examples will demonstrate the versatility of Python’s date and time toolkit.

 

1. Finding the Day of the Week

 
Beyond just knowing the date, you often need to know the day of the week. The datetime module makes this trivial. Every datetime object has a weekday() method, which returns the day of the week as an integer (Monday is 0, Sunday is 6), and a strftime() method, which can format the date to show the full day name.

import datetime

# Pick a date
today = datetime.date(2025, 7, 10)

# Get the day of the week (Monday is 0)
day_of_week_num = today.weekday()
print(f"Day of the week (numeric): {day_of_week_num}")

# Get the full name of the day
day_name = today.strftime("%A")
print(f"The date {today} is a {day_name}")

 

Output:

Day of the week (numeric): 3
The date 2025-07-10 is a Thursday

 

2. Calculating the Time Until a Future Event

 
Ever needed a simple countdown timer? With datetime, you can easily calculate the time remaining until a specific future date and time. By subtracting the current datetime from a future one, you get a timedelta object that represents the difference.

import datetime

# Define a future event
new_year_2050 = datetime.datetime(2050, 1, 1, 0, 0, 0)

# Get the current time
now = datetime.datetime.now()

# Calculate the difference
time_left = new_year_2050 - now

print(f"Time left until New Year 2050: {time_left}")

 

Output:

Time left until New Year 2050: 8940 days, 16:05:52.120836

 

3. Working with Timezones

 
Handling timezones is tricky. A naive datetime object has no timezone data, while an aware object does possess this data. Using the pytz library (or the built-in zoneinfo in Python 3.9+) makes working with timezones manageable.

For instance, you can use one timezone’s time as a base for conversion to another timezone like this:

import datetime
from pytz import timezone

# Create a timezone-aware datetime for New York
nyc_tz = timezone('America/New_York')
nyc_time = datetime.datetime.now(nyc_tz)
print(f"New York Time: {nyc_time}")

# Convert it to another timezone
london_tz = timezone('Europe/London')
london_time = nyc_time.astimezone(london_tz)
print(f"London Time: {london_time}")

 

Output:

New York Time: 2025-07-10 07:57:53.900220-04:00
London Time: 2025-07-10 12:57:53.900220+01:00
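
If you are on Python 3.9 or newer and want to avoid a third-party dependency, the standard-library zoneinfo module handles the same conversion. Here is a minimal equivalent sketch (on Windows you may need to install the tzdata package so the IANA timezone database is available):

import datetime
from zoneinfo import ZoneInfo

# Timezone-aware "now" in New York, then converted to London time
nyc_time = datetime.datetime.now(ZoneInfo("America/New_York"))
london_time = nyc_time.astimezone(ZoneInfo("Europe/London"))

print(f"New York Time: {nyc_time}")
print(f"London Time: {london_time}")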

 

4. Getting the Last Day of a Month

 
Figuring out the last day of a month is not straightforward since months have different numbers of days. You could write logic to handle 30/31 days along with February (don’t forget about leap years!), or you could use a clever trick with datetime and timedelta. The strategy is to find the first day of the next month and then subtract one day.

import datetime

def get_last_day_of_month(year, month):
    # Handle month rollover for December -> January
    if month == 12:
        next_month_first_day = datetime.date(year + 1, 1, 1)
    else:
        next_month_first_day = datetime.date(year, month + 1, 1)
    
    # Subtract one day to get the last day of the current month
    return next_month_first_day - datetime.timedelta(days=1)

# Example: Get the last day of February 2024 (a leap year)
last_day = get_last_day_of_month(2024, 2)
print(f"The last day of February 2024 is: {last_day}")

 

Output:

The last day of February 2024 is: 2024-02-29
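
Alternatively, the calendar module gives the same answer directly: calendar.monthrange() returns the weekday of the month's first day and the number of days in the month. A quick sketch:

import calendar
import datetime

year, month = 2024, 2

# monthrange() returns (weekday_of_first_day, days_in_month)
days_in_month = calendar.monthrange(year, month)[1]
last_day = datetime.date(year, month, days_in_month)
print(f"The last day of February 2024 is: {last_day}")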

 

5. Calculating Your Precise Age

 
You can use datetime to calculate someone’s age down to the day. The logic involves subtracting the birthdate from the current date and then performing a small adjustment to account for whether the person’s birthday has occurred yet this year.

import datetime

def calculate_age(birthdate):
    today = datetime.date.today()
    age = today.year - birthdate.year - ((today.month, today.day) < (birthdate.month, birthdate.day))
    return age

# Example usage
picasso_birthdate = datetime.date(1881, 10, 25)
picasso_age = calculate_age(picasso_birthdate)
print(f"If alive today, Pablo Picasso would be {picasso_age} years old.")

 

Output:

If alive today, Pablo Picasso would be 143 years old.

 

6. Iterating Through a Range of Dates

 
Sometimes you need to perform an operation for every day within a specific date range. You can easily loop through dates by starting with a date object and repeatedly adding a timedelta of one day until you reach the end date.

import datetime

start_date = datetime.date(2025, 1, 1)
end_date = datetime.date(2025, 1, 7)
day_delta = datetime.timedelta(days=1)

current_date = start_date
while current_date <= end_date:
    print(current_date.strftime('%Y-%m-%d, %A'))
    current_date += day_delta

 

Output:

2025-01-01, Wednesday
2025-01-02, Thursday
2025-01-03, Friday
2025-01-04, Saturday
2025-01-05, Sunday
2025-01-06, Monday
2025-01-07, Tuesday

 

7. Parsing Dates from Non-Standard String Formats

 
The strptime() function is useful for converting strings to datetime objects. It is incredibly flexible and can handle a wide variety of formats by using specific format codes. This is essential when dealing with data from different sources that may not use a standard ISO format.

import datetime

date_string_1 = "July 4, 1776"
date_string_2 = "1867-07-01 14:30:00"

# Parse the first string format
dt_object_1 = datetime.datetime.strptime(date_string_1, "%B %d, %Y")
print(f"Parsed object 1: {dt_object_1}")

# Parse the second string format
dt_object_2 = datetime.datetime.strptime(date_string_2, "%Y-%m-%d %H:%M:%S")
print(f"Parsed object 2: {dt_object_2}")

 

Output:

Parsed object 1: 1776-07-04 00:00:00
Parsed object 2: 1867-07-01 14:30:00

 

8. Finding the Nth Weekday of a Month

 
Do you want to know the date of the third Thursday in November? The calendar module can be used alongside datetime to solve this. The monthcalendar() function returns a matrix representing the weeks of a month, which you can then parse.

import calendar

# In the calendar module, Monday is 0 and Sunday is 6,
# so calendar.THURSDAY == 3
cal = calendar.Calendar()

# Get a matrix of weeks for November 2025
month_matrix = cal.monthdatescalendar(2025, 11)

# Find the third Thursday
thursdays = [week[calendar.THURSDAY] for week in month_matrix
             if week[calendar.THURSDAY].month == 11]
third_thursday = thursdays[2]

print(f"The third Thursday of Nov 2025 is: {third_thursday}")

 

Output:

The third Thursday of Nov 2025 is: 2025-11-20

 

9. Getting the ISO Week Number

 
The ISO 8601 standard defines a system for week numbering where a week starts on a Monday. The isocalendar() method returns a tuple containing the ISO year, week number, and weekday for a given date.

Note that the date below is a Thursday, so its ISO weekday should be 4, and it falls in the 28th ISO week of the year.

import datetime

d = datetime.date(2025, 7, 10)
iso_cal = d.isocalendar()

print(f"Date: {d}")
print(f"ISO Year: {iso_cal[0]}")
print(f"ISO Week Number: {iso_cal[1]}")
print(f"ISO Weekday: {iso_cal[2]}")

 

Output:

Date: 2025-07-10
ISO Year: 2025
ISO Week Number: 28
ISO Weekday: 4

 

10. Adding or Subtracting Business Days

 
Calculating future dates while skipping weekends is a common business requirement. While datetime doesn’t have a built-in function for this, you can write a simple helper function using timedelta and the weekday() method.

import datetime

def add_business_days(start_date, num_days):
    current_date = start_date
    while num_days > 0:
        current_date += datetime.timedelta(days=1)
        # weekday() returns 5 for Saturday and 6 for Sunday
        if current_date.weekday() < 5:
            num_days -= 1
    return current_date

start = datetime.date(2025, 7, 10) # A Thursday
end = add_business_days(start, 13)

print(f"13 business days after {start} is {end}")

 

Output:

13 business days after 2025-07-10 is 2025-07-29

 

Wrapping Up

 
Python’s datetime module is more than just a simple tool for storing dates. It provides a flexible and useful set of tools for handling almost any time-related logic imaginable. By understanding its core components — date, time, datetime, and timedelta — and combining them with the calendar module or external libraries like pytz, you can solve complex real-world problems efficiently and accurately.

Don’t forget to check out the datetime module’s documentation for more. You might be surprised at what you can accomplish.
 
 

Matthew Mayo (@mattmayo13) holds a master’s degree in computer science and a graduate diploma in data mining. As managing editor of KDnuggets & Statology, and contributing editor at Machine Learning Mastery, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, language models, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.






Generative AI: A Self-Study Roadmap


 

Introduction

 
The explosion of generative AI has transformed how we think about artificial intelligence. What started with curiosity about GPT-3 has evolved into a business necessity, with companies across industries racing to integrate text generation, image creation, and code synthesis into their products and workflows.

For developers and data practitioners, this shift presents both opportunity and challenge. Traditional machine learning skills provide a foundation, but generative AI engineering demands an entirely different approach—one that emphasizes working with pre-trained foundation models rather than training from scratch, designing systems around probabilistic outputs rather than deterministic logic, and building applications that create rather than classify.

This roadmap provides a structured path to develop generative AI expertise independently. You’ll learn to work with large language models, implement retrieval-augmented generation systems, and deploy production-ready generative applications. The focus remains practical: building skills through hands-on projects that demonstrate your capabilities to employers and clients.

 

Part 1: Understanding Generative AI Fundamentals

 

What Makes Generative AI Different

Generative AI represents a shift from pattern recognition to content creation. Traditional machine learning systems excel at classification, prediction, and optimization—they analyze existing data to make decisions about new inputs. Generative systems create new content: text that reads naturally, images that capture specific styles, code that solves programming problems.

This difference shapes everything about how you work with these systems. Instead of collecting labeled datasets and training models, you work with foundation models that already understand language, images, or code. Instead of optimizing for accuracy metrics, you evaluate creativity, coherence, and usefulness. Instead of deploying deterministic systems, you build applications that produce different outputs each time they run.

Foundation models—large neural networks trained on vast datasets—serve as the building blocks for generative AI applications. These models exhibit emergent capabilities that their creators didn’t explicitly program. GPT-4 can write poetry despite never being specifically trained on poetry datasets. DALL-E can combine concepts it has never seen together, creating images of “a robot painting a sunset in the style of Van Gogh.”

 

Essential Prerequisites

Building generative AI applications requires comfort with Python programming and basic machine learning concepts, but you don’t need deep expertise in neural network architecture or advanced mathematics. Most generative AI work happens at the application layer, using APIs and frameworks rather than implementing algorithms from scratch.

Python Programming: You’ll spend significant time working with APIs, processing text and structured data, and building web applications. Familiarity with libraries like requests, pandas, and Flask or FastAPI will serve you well. Asynchronous programming becomes important when building responsive applications that call multiple AI services.

Machine Learning Concepts: Understanding how neural networks learn helps you work more effectively with foundation models, even though you won’t be training them yourself. Concepts like overfitting, generalization, and evaluation metrics translate directly to generative AI, though the specific metrics differ.

Probability and Statistics: Generative models are probabilistic systems. Understanding concepts like probability distributions, sampling, and uncertainty helps you design better prompts, interpret model outputs, and build robust applications.

 

Large Language Models

Large language models power most current generative AI applications. Built on transformer architecture, these models understand and generate human language with remarkable fluency. Modern LLMs like GPT-4, Claude, and Gemini demonstrate capabilities that extend far beyond text generation. They can analyze code, solve mathematical problems, engage in complex reasoning, and even generate structured data in specific formats.

 

Part 2: The GenAI Engineering Skill Stack

 

Working with Foundation Models

Modern generative AI development centers around foundation models accessed through APIs. This API-first approach offers several advantages: you get access to cutting-edge capabilities without managing infrastructure, you can experiment with different models quickly, and you can focus on application logic rather than model implementation.

Understanding Model Capabilities: Each foundation model excels in different areas. GPT-4 handles complex reasoning and code generation exceptionally well. Claude shows strength in long-form writing and analysis. Gemini integrates multimodal capabilities seamlessly. Learning each model’s strengths helps you select the right tool for specific tasks.

Cost Optimization and Token Management: Foundation model APIs charge based on token usage, making cost optimization essential for production applications. Effective strategies include caching common responses to avoid repeated API calls, using smaller models for simpler tasks like classification or short responses, optimizing prompt length without sacrificing quality, and implementing smart retry logic that avoids unnecessary API calls. Understanding how different models tokenize text helps you estimate costs accurately and design efficient prompting strategies.
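
To make the caching idea concrete, here is a minimal sketch of a prompt-keyed response cache. The call_model function is a hypothetical stand-in for whatever API client you use, and caching like this is only appropriate for deterministic settings (for example, temperature 0) or repeated identical requests:

import hashlib
import json

def call_model(prompt: str, model: str, temperature: float) -> str:
    # Hypothetical stand-in for a real API client call
    return f"[{model} response to: {prompt[:40]}...]"

_cache = {}  # in production this could be Redis or a database table

def cache_key(prompt: str, model: str, temperature: float) -> str:
    # Identical request parameters hash to identical keys
    payload = json.dumps(
        {"prompt": prompt, "model": model, "temperature": temperature},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def cached_completion(prompt: str, model: str = "example-model", temperature: float = 0.0) -> str:
    key = cache_key(prompt, model, temperature)
    if key not in _cache:  # only pay for the API call on a cache miss
        _cache[key] = call_model(prompt, model, temperature)
    return _cache[key]

print(cached_completion("Summarize our Q3 results in one sentence."))
print(cached_completion("Summarize our Q3 results in one sentence."))  # served from the cache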

Quality Evaluation and Testing: Unlike traditional ML models with clear accuracy metrics, evaluating generative AI requires more sophisticated approaches. Automated metrics like BLEU and ROUGE provide baseline measurements for text quality, but human evaluation remains essential for assessing creativity, relevance, and safety. Build custom evaluation frameworks that include test sets representing your specific use case, clear criteria for success (relevance, accuracy, style consistency), both automated and human evaluation pipelines, and A/B testing capabilities for comparing different approaches.

 

Prompt Engineering Excellence

Prompt engineering transforms generative AI from impressive demo to practical tool. Well-designed prompts consistently produce useful outputs, while poor prompts lead to inconsistent, irrelevant, or potentially harmful results.

Systematic Design Methodology: Effective prompt engineering follows a structured approach. Start with clear objectives—what specific output do you need? Define success criteria—how will you know when the prompt works well? Design iteratively—test variations and measure results systematically. Consider a content summarization task: an engineered prompt specifies length requirements, target audience, key points to emphasize, and output format, producing dramatically better results than “Summarize this article.”

Advanced Techniques: Chain-of-thought prompting encourages models to show their reasoning process, often improving accuracy on complex problems. Few-shot learning provides examples that guide the model toward desired outputs. Constitutional AI techniques help models self-correct problematic responses. These techniques often combine effectively—a complex analysis task might use few-shot examples to demonstrate reasoning style, chain-of-thought prompting to encourage step-by-step thinking, and constitutional principles to ensure balanced analysis.

Dynamic Prompt Systems: Production applications rarely use static prompts. Dynamic systems adapt prompts based on user context, previous interactions, and specific requirements through template systems that insert relevant information, conditional logic that adjusts prompting strategies, and feedback loops that improve prompts based on user satisfaction.
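
As a rough illustration of such a system, the sketch below builds a summarization prompt from a template using only the standard library; the template text, defaults, and the summarize_request helper are assumptions for the example, not part of any particular framework:

from string import Template

SUMMARY_TEMPLATE = Template(
    "Summarize the following article for $audience in at most $max_words words.\n"
    "Emphasize: $key_points\n"
    "Return the summary as $output_format.\n\n"
    "Article:\n$article"
)

def summarize_request(article: str, audience: str = "a technical manager",
                      max_words: int = 150, key_points: str = "business impact",
                      output_format: str = "three bullet points") -> str:
    # Conditional logic adjusts the prompt based on context
    if len(article.split()) < 200:
        max_words = min(max_words, 75)  # short inputs get shorter summaries
    return SUMMARY_TEMPLATE.substitute(
        audience=audience, max_words=max_words,
        key_points=key_points, output_format=output_format, article=article,
    )

print(summarize_request("A short example article about a product launch.", audience="an executive audience"))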

 

Retrieval-Augmented Generation (RAG) Systems

RAG addresses one of the biggest limitations of foundation models: their knowledge cutoff dates and lack of domain-specific information. By combining pre-trained models with external knowledge sources, RAG systems provide accurate, up-to-date information while maintaining the natural language capabilities of foundation models.

Architecture Patterns: Simple RAG systems retrieve relevant documents and include them in prompts for context. Advanced RAG implementations use multiple retrieval steps, rerank results for relevance, and generate follow-up queries to gather comprehensive information. The choice depends on your requirements—simple RAG works well for focused knowledge bases, while advanced RAG handles complex queries across diverse sources.

Vector Databases and Embedding Strategies: RAG systems rely on semantic search to find relevant information, requiring documents converted into vector embeddings that capture meaning rather than keywords. Vector database selection affects both performance and cost: Pinecone offers managed hosting with excellent performance for production applications; Chroma focuses on simplicity and works well for local development and prototyping; Weaviate provides rich querying capabilities and good performance for complex applications; FAISS offers high-performance similarity search when you can manage your own infrastructure.

Document Processing: The quality of your RAG system depends heavily on how you process and chunk documents. Better strategies consider document structure, maintain semantic coherence, and optimize chunk size for your specific use case. Preprocessing steps like cleaning formatting, extracting metadata, and creating document summaries improve retrieval accuracy.
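
To make the retrieve-then-generate flow concrete, here is a deliberately tiny sketch. It substitutes simple keyword-overlap scoring for the embedding model and vector database so it runs with no dependencies; in a real system you would replace score() with embedding similarity against a vector store and send the assembled prompt to a foundation model:

def score(query: str, chunk: str) -> int:
    # Stand-in for embedding similarity: count overlapping words
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def retrieve(query: str, chunks: list, k: int = 2) -> list:
    # Rank chunks by relevance and keep the top k
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

def build_prompt(query: str, chunks: list) -> str:
    context = "\n\n".join(retrieve(query, chunks))
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available by email between 9am and 5pm CET.",
    "Enterprise plans include a dedicated account manager.",
]
print(build_prompt("How long do customers have to request a refund?", docs))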

 

Part 3: Tools and Implementation Framework

 

Essential GenAI Development Tools

LangChain and LangGraph provide frameworks for building complex generative AI applications. LangChain simplifies common patterns like prompt templates, output parsing, and chain composition. LangGraph extends this with support for complex workflows that include branching, loops, and conditional logic. These frameworks excel when building applications that combine multiple AI operations, like a document analysis application that orchestrates loading, chunking, embedding, retrieval, and summarization.

Hugging Face Ecosystem offers comprehensive tools for generative AI development. The model hub provides access to thousands of pre-trained models. Transformers library enables local model inference. Spaces allows easy deployment and sharing of applications. For many projects, Hugging Face provides everything needed for development and deployment, particularly for applications using open-source models.

Vector Database Solutions store and search the embeddings that power RAG systems. Choose based on your scale, budget, and feature requirements—managed solutions like Pinecone for production applications, local options like Chroma for development and prototyping, or self-managed solutions like FAISS for high-performance custom implementations.

 

Building Production GenAI Systems

API Design for Generative Applications: Generative AI applications require different API design patterns than traditional web services. Streaming responses improve user experience for long-form generation, allowing users to see content as it’s generated. Async processing handles variable generation times without blocking other operations. Caching reduces costs and improves response times for repeated requests. Consider implementing progressive enhancement where initial responses appear quickly, followed by refinements and additional information.

Handling Non-Deterministic Outputs: Unlike traditional software, generative AI produces different outputs for identical inputs. This requires new approaches to testing, debugging, and quality assurance. Implement output validation that checks for format compliance, content safety, and relevance. Design user interfaces that set appropriate expectations about AI-generated content. Version control becomes more complex—consider storing input prompts, model parameters, and generation timestamps to enable reproduction of specific outputs when needed.
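
Below is a minimal sketch of format validation for a model that was asked to return JSON; the required keys and allowed sentiment values are assumptions for the example. A check like this sits between the model call and the rest of the application, and a failed check can trigger a retry or a fallback response:

import json
from typing import Optional

REQUIRED_KEYS = {"summary", "sentiment"}  # assumed output schema for this example

def validate_output(raw: str) -> Optional[dict]:
    # Return the parsed payload if it matches the expected shape, otherwise None
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not REQUIRED_KEYS.issubset(data):
        return None
    if data["sentiment"] not in {"positive", "neutral", "negative"}:
        return None
    return data

# A retry loop would call the model again whenever validation returns None
print(validate_output('{"summary": "Launch went well.", "sentiment": "positive"}'))
print(validate_output("Sure! Here is the summary you asked for..."))  # -> None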

Content Safety and Filtering: Production generative AI systems must handle potentially harmful outputs. Implement multiple layers of safety: prompt design that discourages harmful outputs, output filtering that catches problematic content using specialized safety models, and user feedback mechanisms that help identify issues. Monitor for prompt injection attempts and unusual usage patterns that might indicate misuse.

 

Part 4: Hands-On Project Portfolio

 
Building expertise in generative AI requires hands-on experience with increasingly complex projects. Each project should demonstrate specific capabilities while building toward more sophisticated applications.

 

Project 1: Smart Chatbot with Custom Knowledge

Start with a conversational AI that can answer questions about a specific domain using RAG. This project introduces prompt engineering, document processing, vector search, and conversation management.

Implementation focus: Design system prompts that establish the bot’s personality and capabilities. Implement basic RAG with a small document collection. Build a simple web interface for testing. Add conversation memory so the bot remembers context within sessions.

Key learning outcomes: Understanding how to combine foundation models with external knowledge. Experience with vector embeddings and semantic search. Practice with conversation design and user experience considerations.

 

Project 2: Content Generation Pipeline

Build a system that creates structured content based on user requirements. For example, a marketing content generator that produces blog posts, social media content, and email campaigns based on product information and target audience.

Implementation focus: Design template systems that guide generation while allowing creativity. Implement multi-step workflows that research, outline, write, and refine content. Add quality evaluation and revision loops that assess content against multiple criteria. Include A/B testing capabilities for different generation strategies.

Key learning outcomes: Experience with complex prompt engineering and template systems. Understanding of content evaluation and iterative improvement. Practice with production deployment and user feedback integration.

 

Project 3: Multimodal AI Assistant

Create an application that processes both text and images, generating responses that might include text descriptions, image modifications, or new image creation. This could be a design assistant that helps users create and modify visual content.

Implementation focus: Integrate multiple foundation models for different modalities. Design workflows that combine text and image processing. Implement user interfaces that handle multiple content types. Add collaborative features that let users refine outputs iteratively.

Key learning outcomes: Understanding multimodal AI capabilities and limitations. Experience with complex system integration. Practice with user interface design for AI-powered tools.

 

Documentation and Deployment

Each project requires comprehensive documentation that demonstrates your thinking process and technical decisions. Include architecture overviews explaining system design choices, prompt engineering decisions and iterations, and setup instructions enabling others to reproduce your work. Deploy at least one project to a publicly accessible endpoint—this demonstrates your ability to handle the full development lifecycle from concept to production.

 

Part 5: Advanced Considerations

 

Fine-Tuning and Model Customization

While foundation models provide impressive capabilities out of the box, some applications benefit from customization to specific domains or tasks. Consider fine-tuning when you have high-quality, domain-specific data that foundation models don’t handle well—specialized technical writing, industry-specific terminology, or unique output formats requiring consistent structure.

Parameter-Efficient Techniques: Modern fine-tuning often uses methods like LoRA (Low-Rank Adaptation) that modify only a small subset of model parameters while keeping the original model frozen. QLoRA extends this with quantization for memory efficiency. These techniques reduce computational requirements while maintaining most benefits of full fine-tuning and enable serving multiple specialized models from a single base model.

 

Emerging Patterns

Multimodal Generation combines text, images, audio, and other modalities in single applications. Modern models can generate images from text descriptions, create captions for images, or even generate videos from text prompts. Consider applications that generate illustrated articles, create video content from written scripts, or design marketing materials combining text and images.

Code Generation Beyond Autocomplete extends from simple code completion to full development workflows. Modern AI can understand requirements, design architectures, implement solutions, write tests, and even debug problems. Building applications that assist with complex development tasks requires understanding both coding patterns and software engineering practices.

 

Part 6: Responsible GenAI Development

 

Understanding Limitations and Risks

Hallucination Detection: Foundation models sometimes generate confident-sounding but incorrect information. Mitigation strategies include designing prompts that encourage citing sources, implementing fact-checking workflows that verify important claims, building user interfaces that communicate uncertainty appropriately, and using multiple models to cross-check important information.

Bias in Generative Outputs: Foundation models reflect biases present in their training data, potentially perpetuating stereotypes or unfair treatment. Address bias through diverse evaluation datasets that test for various forms of unfairness, prompt engineering techniques that encourage balanced representation, and ongoing monitoring that tracks outputs for biased patterns.

 

Building Ethical GenAI Systems

Human Oversight: Effective generative AI applications include appropriate human oversight, particularly for high-stakes decisions or creative work where human judgment adds value. Design oversight mechanisms that enhance rather than hinder productivity—smart routing that escalates only cases requiring human attention, AI assistance that helps humans make better decisions, and feedback loops that improve AI performance over time.

Transparency: Users benefit from understanding how AI systems make decisions and generate content. Focus on communicating relevant information about AI capabilities, limitations, and reasoning behind specific outputs without exposing technical details that users won’t understand.

 

Part 7: Staying Current in the Fast-Moving GenAI Space

The generative AI field evolves rapidly, with new models, techniques, and applications emerging regularly. Follow research labs like OpenAI, Anthropic, Google DeepMind, and Meta AI for breakthrough announcements. Subscribe to newsletters like The Batch from deeplearning.ai and engage with practitioner communities on Discord servers focused on AI development and Reddit’s MachineLearning communities.

Continuous Learning Strategy: Stay informed about developments across the field while focusing deeper learning on areas most relevant to your career goals. Follow model releases from major labs and test new capabilities systematically. Regular hands-on experimentation helps you understand what new models can do and identify practical applications. Set aside time for exploring new models, testing emerging techniques, and building small proof-of-concept applications.

Contributing to Open Source: Contributing to generative AI open-source projects provides deep learning opportunities while building professional reputation. Start with small contributions—documentation improvements, bug fixes, or example applications. Consider larger contributions like new features or entirely new projects that address unmet community needs.

 

Resources for Continued Learning

 
Free Resources:

  1. Hugging Face Course: Comprehensive introduction to transformer models and practical applications
  2. LangChain Documentation: Detailed guides for building LLM applications
  3. OpenAI Cookbook: Practical examples and best practices for GPT models
  4. Papers with Code: Latest research with implementation examples

 
Paid Resources:

  1. “AI Engineering: Building Applications with Foundation Models” by Chip Huyen: A full-length guide to designing, evaluating, and deploying foundation model applications. Also available: a shorter, free overview titled “Building LLM-Powered Applications”, which introduces many of the core ideas. 
  2. Coursera’s “Generative AI with Large Language Models”: Structured curriculum covering theory and practice
  3. DeepLearning.AI’s Short Courses: Focused tutorials on specific techniques and tools

 

Conclusion

 
The path from curious observer to skilled generative AI engineer involves developing both technical capabilities and practical experience building systems that create rather than classify. Starting with foundation model APIs and prompt engineering, you’ll learn to work with the building blocks of modern generative AI. RAG systems teach you to combine pre-trained capabilities with external knowledge. Production deployment shows you how to handle the unique challenges of non-deterministic systems.

The field continues evolving rapidly, but the approaches covered here—systematic prompt engineering, robust system design, careful evaluation, and responsible development practices—remain relevant as new capabilities emerge. Your portfolio of projects provides concrete evidence of your skills while your understanding of underlying principles prepares you for future developments.

The generative AI field rewards both technical skill and creative thinking. Your ability to combine foundation models with domain expertise, user experience design, and system engineering will determine your success in this exciting and rapidly evolving field. Continue building, experimenting, and sharing your work with the community as you develop expertise in creating AI systems that genuinely augment human capabilities.
 
 

Born in India and raised in Japan, Vinod brings a global perspective to data science and machine learning education. He bridges the gap between emerging AI technologies and practical implementation for working professionals. Vinod focuses on creating accessible learning pathways for complex topics like agentic AI, performance optimization, and AI engineering. He focuses on practical machine learning implementations and mentoring the next generation of data professionals through live sessions and personalized guidance.




Kaggle CLI Cheat Sheet – KDnuggets


 

The Kaggle CLI (Command Line Interface) allows you to interact with Kaggle’s datasets, competitions, notebooks, and models directly from your terminal. This is useful for automating downloads, submissions, and dataset management without needing a web browser. Most of my GitHub Action workflows use Kaggle CLI for downloading or pushing datasets, as it is the fastest and most efficient way.

 

1. Installation & Setup

 
Make sure you have Python 3.10+ installed. Then, run the following command in your terminal to install the official Kaggle API:

pip install kaggle

To obtain your Kaggle credentials, download the kaggle.json file from your Kaggle account settings by clicking “Create New Token.”  

Next, set the environment variables in your local system:  

  • KAGGLE_USERNAME=
  • KAGGLE_KEY=

 

2. Competitions

 
Kaggle Competitions are hosted challenges where you can solve machine learning problems, download data, submit predictions, and see your results on the leaderboard. 

The CLI helps you automate everything: browsing competitions, downloading files, submitting solutions, and more.

 

List Competitions

kaggle competitions list -s <search_term>

Shows a list of Kaggle competitions, optionally filtered by a search term. Useful for discovering new challenges to join.

 

List Competition Files

kaggle competitions files <competition>

Displays all files available for a specific competition, so you know what data is provided.

 

Download Competition Files

kaggle competitions download <competition> [-f <file_name>] [-p <path>]

Downloads all or specific files from a competition to your local machine. Use -f to specify a file, -p to set the download folder.

 

Submit to a Competition

kaggle competitions submit <competition> -f <file_name> -m "<message>"

Upload your solution file to a competition with an optional message describing your submission.

 

List Your Submissions

kaggle competitions submissions <competition>

Shows all your previous submissions for a competition, including scores and timestamps.

 

View Leaderboard

kaggle competitions leaderboard <competition> [-s]

Displays the current leaderboard for a competition. Use -s to show only the top entries.

 

3. Datasets

 
Kaggle Datasets are collections of data shared by the community. The dataset CLI commands help you find, download, and upload datasets, as well as manage dataset versions.

 

List Datasets

kaggle datasets list -s <search_term>

Finds datasets on Kaggle, optionally filtered by a search term. Great for discovering data for your projects.

 

List Files in a Dataset

kaggle datasets files <owner>/<dataset>

Shows all files included in a specific dataset, so you can see what’s available before downloading.

 

Download Dataset Files

kaggle datasets download <owner>/<dataset> [-f <file_name>] [--unzip]

Downloads all or specific files from a dataset. Use --unzip to automatically extract zipped files.
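
If you want to script this step, for example inside a CI job, a minimal Python sketch that shells out to the same command could look like the following; the dataset slug and target folder are placeholders:

import subprocess

def download_dataset(dataset: str, path: str = "data") -> None:
    # Equivalent to: kaggle datasets download <owner>/<dataset> -p <path> --unzip
    subprocess.run(
        ["kaggle", "datasets", "download", dataset, "-p", path, "--unzip"],
        check=True,
    )

# download_dataset("owner/dataset-name")  # replace with a real <owner>/<dataset> slug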

 

Initialize Dataset Metadata

kaggle datasets init -p <folder>

Creates a metadata file in a folder, preparing it for dataset creation or versioning.

 

Create a New Dataset

kaggle datasets create -p <folder>

Uploads a new dataset from a folder containing your data and metadata.

 

Create a New Dataset Version

kaggle datasets version -p <folder> -m "<message>"

Uploads a new version of an existing dataset, with a message describing the changes.

 

4. Notebooks

 
Kaggle Notebooks are executable code snippets or notebooks. The CLI allows you to list, download, upload, and check the status of these notebooks, which is useful for sharing or automating analysis.

 

List Kernels

kaggle kernels list -s <search_term>

Finds public Kaggle notebooks (kernels) matching your search term.

 

Get Kernel Code

kaggle kernels pull <owner>/<kernel> [-p <path>]

Downloads the code for a specific kernel to your local machine.

 

Initialize Kernel Metadata

kaggle kernels init -p <folder>

Creates a metadata file in a folder, preparing it for kernel creation or updates.

 

Update Kernel

kaggle kernels push -p <folder>

Uploads new code and runs the kernel, updating it on Kaggle.

 

Get Kernel Output

kaggle kernels output <owner>/<kernel> -p <path>

Downloads the output files generated by a kernel run.

 

Check Kernel Status

kaggle kernels status <owner>/<kernel>

Shows the current status (e.g., running, complete, failed) of a kernel.

 

5. Models

 
Kaggle Models are versioned machine learning models you can share, reuse, or deploy. The CLI helps manage these models, from listing and downloading to creating and updating them.

 

List Models

Finds public models on Kaggle matching your search term.

 

Get a Model

Downloads a model and its metadata to your local machine.

 

Initialize Model Metadata

Creates a metadata file in a folder, preparing it for model creation.

 

Create a New Model

Uploads a new model to Kaggle from your local folder.

 

Update a Model

Uploads a new version of an existing model.

 

Delete a Model

Removes a model from Kaggle.

 

6. Config

 
Kaggle CLI configuration commands control default behaviors, such as download locations and your default competition. Adjust these settings to make your workflow smoother.

 

View Config

kaggle config view

Displays your current Kaggle CLI configuration settings (e.g., default competition, download path).

 

Set Config

kaggle config set -n <name> -v <value>

Sets a configuration value, such as default competition or download path.

 

Unset Config

kaggle config unset -n <name>

Removes a configuration value, reverting to default behavior.

 

7. Tips

 

  • Use -h or --help after any command for detailed options and usage
  • Use -v for CSV output, -q for quiet mode
  • You must accept competition rules on the Kaggle website before downloading or submitting to competitions

 
 

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master’s degree in technology management and a bachelor’s degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.




Automate SQL Workflows with n8n: Scheduled Database Reports via Email


 

The Hidden Cost of Routine SQL Reporting

 
Data teams across organizations face the same recurring challenge: stakeholders require regular reports, but manual SQL reporting consumes valuable time that could be spent on analysis. The process remains consistent regardless of company size — connect to the database, execute queries, format results, and distribute findings to decision-makers.

Data professionals routinely handle reporting tasks that don’t require advanced statistical knowledge or domain expertise, yet they consume significant time through repetitive execution of the same queries and formatting procedures.

This workflow addresses a fundamental efficiency problem: transforming one-time setup into ongoing automated delivery of professional reports directly to stakeholder inboxes.

 

The Solution: A 4-Node Automated Reporting Pipeline

 
Building on our previous n8n exploration, this workflow tackles a different automation challenge: scheduled SQL reporting. While our first tutorial focused on data quality analysis, this one demonstrates how n8n handles database integration, recurring schedules, and email distribution.

Unlike writing standalone Python scripts for reporting, n8n workflows are visual, reusable, and easy to modify. You can connect databases, perform transformations, run analyses, and deliver results — all without switching between different tools or environments. Each workflow consists of “nodes” that represent different actions, connected together to create an automated pipeline.

Our automated SQL reporter consists of four connected nodes that transform manual reporting into a hands-off process:

 
[Screenshot: the four-node workflow in the n8n editor]
 

  1. Schedule Trigger – Runs every Monday at 9 AM
  2. PostgreSQL Node – Executes sales query against database
  3. Code Node – Transforms raw data into formatted HTML report
  4. Send Email Node – Delivers professional report to stakeholders

 

Building the Workflow: Step-by-Step Implementation

 

Prerequisites

To follow along, you need a running n8n instance (cloud or self-hosted), a free Supabase account or another PostgreSQL database you can query, and a Gmail account with 2-step verification enabled so you can generate an app password for SMTP sending.

 

Step 1: Set Up Your PostgreSQL Database

We’ll create a realistic sales database using Supabase for this tutorial. Supabase is a cloud-based PostgreSQL platform that provides managed databases with built-in APIs and authentication—making it ideal for rapid prototyping and production applications. While this tutorial uses Supabase for convenience, the n8n workflow connects to any PostgreSQL database, including AWS RDS, Google Cloud SQL, or your organization’s existing database infrastructure.

Create Supabase Account:

  1. Visit supabase.com and sign up for free
  2. Create new project – choose any name and region
  3. Wait for setup – takes about 2 minutes for database provisioning
  4. View your connection details from the Settings > Database page (or the “connect” button on the main page)

Load Sample Data:

Navigate to the SQL Editor in Supabase and run this setup script to create our sales database tables and populate them with sample data:

-- Create employees table
CREATE TABLE employees (
    emp_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department VARCHAR(50)
);

-- Create sales table
CREATE TABLE sales (
    sale_id SERIAL PRIMARY KEY,
    emp_id INTEGER REFERENCES employees(emp_id),
    sale_amount DECIMAL(10,2),
    sale_date DATE
);

-- Insert sample employees
INSERT INTO employees (first_name, last_name, department) VALUES
('Mike', 'Johnson', 'Sales'),
('John', 'Doe', 'Sales'),
('Tom', 'Wilson', 'Sales'),
('Sarah', 'Chen', 'Marketing');

-- Insert recent sales data
INSERT INTO sales (emp_id, sale_amount, sale_date) VALUES
(1, 2500.00, CURRENT_DATE - 2),
(1, 1550.00, CURRENT_DATE - 5),
(2, 890.00, CURRENT_DATE - 1),
(2, 1500.00, CURRENT_DATE - 4),
(3, 3200.00, CURRENT_DATE - 3),
(4, 1200.00, CURRENT_DATE - 6);

 

Paste this entire script into the SQL Editor and click the “Run” button in the bottom-right corner. You should see “Success. No rows returned” confirming that your tables and sample data have been created successfully.

 
[Screenshot: Supabase SQL Editor after running the setup script]
 

Test Your Connection:

Within the same SQL Editor, run a fresh query to verify everything works: SELECT COUNT(*) FROM employees;

You should see 4 employees in the results.

 

Step 2: Configure Gmail for Automated Sending

Enable App Password:

  1. Turn on 2-step verification in your Google Account settings
  2. Generate app password – go to Security > App passwords
  3. Select “Mail” and “Other” – name it “n8n reporting”
  4. Copy the 16-character password – you’ll need this for n8n

 

Step 3: Import and Configure the Workflow

Import the Template:

  1. Download the workflow file
  2. Open n8n and click “Import from File”
  3. Select the downloaded file – all four nodes appear automatically
  4. Save the workflow as “Automated SQL Reporting”

The imported workflow contains four connected nodes with all the complex SQL and formatting code already configured.

Configure Database Connection:

  1. Click the PostgreSQL node
  2. Get your connection details from Supabase by clicking the “Connect” button on your main page. For n8n integration, use the “Transaction pooler” connection string as it’s optimized for automated workflows:

 
[Screenshot: Supabase connection details with the Transaction pooler option]
 

  3. Create new credential with your Supabase details:
    • Host: [your-project].supabase.com
    • Database: postgres
    • User: postgres…..
    • Password: [from Supabase settings]
    • Port: 6543
    • SSL: Enable
  4. Test connection – you should see a green success message

Configure Email Settings:

  1. Click the Send Email node
  2. Create SMTP credential:
    • Host: smtp.gmail.com
    • Port: 587
    • User: your-email@gmail.com
    • Password: [your app password]
    • Secure: Enable STARTTLS
  3. Update recipient in the “To Email” field

 

That’s it! The analysis logic automatically adapts to different database schemas, table names, and data types.

 

Step 4: Test and Deploy

  1. Click “Execute Workflow” in the toolbar
  2. Watch each node turn green as it processes
  3. Check your email – you should receive the formatted report
  4. Toggle to “Active” to enable Monday morning automation

Once the setup is complete, you’ll receive automatic weekly reports without any manual intervention.

 

Understanding Your Automated Report

 

Here’s what your stakeholders will receive every Monday:

Email Subject: 📊 Weekly Sales Report – June 27, 2025

Report Content:

  • Clean HTML table with proper styling and borders
  • Summary statistics calculated automatically from SQL results
  • Professional formatting suitable for executive stakeholders
  • Timestamp and metadata for audit trails

Here’s what the final report looks like:

 
[Screenshot: the formatted weekly sales report email]
 

The workflow automatically handles all the complex formatting and calculations behind this professional output. Notice how the report includes proper currency formatting, calculated averages, and clean table styling—all generated directly from raw SQL results without any manual intervention. The email arrives with a timestamp, making it easy for stakeholders to track reporting periods and maintain audit trails for decision-making processes.

 

Technical Deep Dive: Understanding the Implementation

 
Schedule Trigger Configuration:

The workflow runs every Monday at 9:00 AM using n8n’s interval scheduling. This timing ensures reports arrive before weekly team meetings.

SQL Query Logic:

The PostgreSQL node executes a sophisticated query with JOINs, date filtering, aggregations, and proper numeric formatting. It automatically:

  • Joins employee and sales tables for complete records
  • Filters data to last 7 days using CURRENT_DATE - INTERVAL '7 days'
  • Calculates total sales, revenue, and averages per person
  • Orders results by revenue for business prioritization

HTML Generation Logic:

The Code node transforms SQL results into professional HTML using JavaScript. It iterates through query results, builds styled HTML tables with consistent formatting, calculates summary statistics, and adds professional touches like emojis and timestamps.
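
The workflow template already contains this JavaScript, but purely as an illustration of the same transformation, here is a hedged Python sketch that turns query rows into an HTML table with a summary line. The column names mirror the sample schema above and are otherwise assumptions:

from datetime import date

def build_report(rows: list) -> str:
    # Each row is a dict such as {"first_name": ..., "last_name": ..., "total_sales": ..., "total_revenue": ...}
    total_revenue = sum(float(r["total_revenue"]) for r in rows)
    body = "".join(
        f"<tr><td>{r['first_name']} {r['last_name']}</td>"
        f"<td>{r['total_sales']}</td>"
        f"<td>${float(r['total_revenue']):,.2f}</td></tr>"
        for r in rows
    )
    return (
        f"<h2>Weekly Sales Report - {date.today():%B %d, %Y}</h2>"
        "<table border='1' cellpadding='6'>"
        "<tr><th>Employee</th><th>Sales</th><th>Revenue</th></tr>"
        f"{body}</table>"
        f"<p>Total revenue: ${total_revenue:,.2f} across {len(rows)} salespeople.</p>"
    )

print(build_report([
    {"first_name": "Mike", "last_name": "Johnson", "total_sales": 2, "total_revenue": 4050.00},
    {"first_name": "Tom", "last_name": "Wilson", "total_sales": 1, "total_revenue": 3200.00},
]))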

Email Delivery:

The Send Email node uses Gmail’s SMTP service with proper authentication and HTML rendering support.

 

Testing with Different Scenarios

 
To see how the workflow handles varying data patterns, try these modifications:

  1. Different Time Periods: Change INTERVAL '7 days' to INTERVAL '30 days' for monthly reports
  2. Department Filtering: Add WHERE e.department = 'Sales' for team-specific reports
  3. Different Metrics: Modify SELECT clause to include product categories or customer segments

Based on your business needs, you can determine next steps: weekly reports work well for operational teams, monthly reports suit strategic planning, quarterly reports serve executive dashboards, and daily reports help with real-time monitoring. The workflow adapts automatically to any SQL structure, allowing you to quickly create multiple reporting pipelines for different stakeholders.

 

Next Steps

 

1. Multi-Database Support

Replace the PostgreSQL node with MySQL, SQL Server, or any supported database. The workflow logic remains identical while connecting to different data sources. This flexibility makes the solution valuable across diverse technology stacks.

 

2. Advanced Scheduling

Modify the Schedule Trigger for different frequencies. Set up daily reports for operational metrics, monthly reports for strategic planning, or quarterly reports for board meetings. Each schedule can target different recipient groups with tailored content.

 

3. Enhanced Formatting

Extend the Code node to include charts and visualizations using Chart.js, conditional formatting based on performance thresholds, or executive summaries with key insights. The HTML output supports rich formatting and embedded graphics.

 

4. Multi-Recipient Distribution

Add logic to send different reports to different stakeholders. Sales managers receive individual team reports, executives receive high-level summaries, and finance teams receive revenue-focused metrics. This targeted approach ensures each audience gets relevant information.

 

Conclusion

 
This automated SQL reporting workflow demonstrates how n8n bridges the gap between data science expertise and operational efficiency. By combining database integration, scheduling, and email automation, you can eliminate routine reporting tasks while delivering professional results to stakeholders.

The workflow’s modular design makes it particularly valuable for data teams managing multiple reporting requirements. You can duplicate the workflow for different databases, modify the SQL queries for various metrics, and adjust the formatting for different audiences—all without writing custom scripts or managing server infrastructure.

Unlike traditional ETL tools that require extensive configuration, n8n’s visual interface makes complex data workflows accessible to both technical and non-technical team members. Your SQL expertise remains the core value, while n8n handles the automation infrastructure, scheduling reliability, and delivery mechanisms.

Most importantly, this approach scales with your organization’s needs. Start with simple weekly reports, then expand to include data visualizations, multi-database queries, or integration with business intelligence platforms. The foundation you build today becomes the automated reporting infrastructure that supports your team’s growth tomorrow.
 
 

Born in India and raised in Japan, Vinod brings a global perspective to data science and machine learning education. He bridges the gap between emerging AI technologies and practical implementation for working professionals. Vinod focuses on creating accessible learning pathways for complex topics like agentic AI, performance optimization, and AI engineering. He focuses on practical machine learning implementations and mentoring the next generation of data professionals through live sessions and personalized guidance.


