Google Launches T5Gemma to Reclaim Encoder-Decoder Architecture Benefits
Google has launched T5Gemma, a new collection of encoder-decoder large language models (LLMs) that promise improved quality and inference efficiency compared to their decoder-only counterparts. The models are built on the Gemma 2 framework.
Unlike the current trend that favours decoder-only LLMs, T5Gemma revisits the classic encoder-decoder architecture used in models like T5. The company introduced an adaptation technique that converts pretrained decoder-only models into encoder-decoder ones.
“We study a novel problem: adapting pretrained decoder-only LLMs to encoder-decoder, with the goal of leveraging the strengths of both approaches to achieve a more favourable quality efficiency trade-off,” the research paper mentioned.
The researchers further highlight that this adaptation not only enables inheriting the capability of decoder-only LLMs but also reduces the computational demand compared to pretraining from scratch.
“Can we build top-tier encoder-decoder models based on pretrained decoder-only models? We answer this question by exploring model adaptation,” the company explained in the blog post.
T5Gemma includes both newly trained T5-sized models, ranging from small to XL, and adapted Gemma 2 models with 2B and 9B parameters. It also offers unbalanced combinations such as a 9B encoder with a 2B decoder, aimed at optimising performance for tasks where input understanding is more important than output complexity.
According to benchmark results shared by Google, T5Gemma dominates the quality-inference efficiency Pareto frontier. On SuperGLUE and GSM8K, the models match or exceed comparable decoder-only models on accuracy at similar or lower latency. For example, T5Gemma 9B-9B delivered higher GSM8K accuracy than Gemma 2 9B while maintaining similar latency.
The gains extend beyond pretraining. After instruction tuning, T5Gemma models showed dramatic improvements. The 2B-2B model’s MMLU score jumped 12 points, while GSM8K accuracy rose from 58.0% to 70.7%, highlighting the architecture’s responsiveness to fine-tuning.
Google has released a wide range of T5Gemma checkpoints, including pretrained and instruction-tuned variants, with multiple training objectives such as PrefixLM and UL2.
The models are now available on Hugging Face, Kaggle, and Vertex AI for further experimentation and deployment.
Kaggle CLI Cheat Sheet – KDnuggets
The Kaggle CLI (Command Line Interface) allows you to interact with Kaggle’s datasets, competitions, notebooks, and models directly from your terminal. This is useful for automating downloads, submissions, and dataset management without needing a web browser. Most of my GitHub Actions workflows use the Kaggle CLI for downloading or pushing datasets, as it is the fastest and most efficient way.
1. Installation & Setup
Make sure you have Python 3.10+ installed. Then, run the following command in your terminal to install the official Kaggle API:
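pip install kaggle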
To obtain your Kaggle credentials, download the kaggle.json file from your Kaggle account settings by clicking “Create New Token.”
Next, set the environment variables in your local system:
- KAGGLE_USERNAME=<your_kaggle_username>
- KAGGLE_KEY=<your_kaggle_api_key>
2. Competitions
Kaggle Competitions are hosted challenges where you can solve machine learning problems, download data, submit predictions, and see your results on the leaderboard.
The CLI helps you automate everything: browsing competitions, downloading files, submitting solutions, and more.
List Competitions
kaggle competitions list -s <search_term>
Shows a list of Kaggle competitions, optionally filtered by a search term. Useful for discovering new challenges to join.
List Competition Files
kaggle competitions files <competition>
Displays all files available for a specific competition, so you know what data is provided.
Download Competition Files
kaggle competitions download <competition> [-f <file_name>] [-p <path>]
Downloads all or specific files from a competition to your local machine. Use -f to specify a file, -p to set the download folder.
Submit to a Competition
kaggle competitions submit <competition> -f <file_name> -m "<message>"
Upload your solution file to a competition with an optional message describing your submission.
List Your Submissions
kaggle competitions submissions <competition>
Shows all your previous submissions for a competition, including scores and timestamps.
View Leaderboard
kaggle competitions leaderboard <competition> [-s]
Displays the current leaderboard for a competition. Use -s to show only the top entries.
3. Datasets
Kaggle Datasets are collections of data shared by the community. The dataset CLI commands help you find, download, and upload datasets, as well as manage dataset versions.
List Datasets
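kaggle datasets list -s <search_term>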
Finds datasets on Kaggle, optionally filtered by a search term. Great for discovering data for your projects.
List Files in a Dataset
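kaggle datasets files <owner>/<dataset-name>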
Shows all files included in a specific dataset, so you can see what’s available before downloading.
Download Dataset Files
kaggle datasets download <owner>/<dataset-name> [-f <file_name>] [--unzip]
Downloads all or specific files from a dataset. Use --unzip to automatically extract zipped files.
Initialize Dataset Metadata
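kaggle datasets init -p <folder>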
Creates a metadata file in a folder, preparing it for dataset creation or versioning.
Create a New Dataset
kaggle datasets create -p <folder>
Uploads a new dataset from a folder containing your data and metadata.
Create a New Dataset Version
kaggle datasets version -p <folder> -m "<message>"
Uploads a new version of an existing dataset, with a message describing the changes.
4. Notebooks
Kaggle Notebooks (also called kernels) are executable notebooks and scripts hosted on Kaggle. The CLI allows you to list, download, upload, and check the status of these notebooks, which is useful for sharing or automating analysis.
List Kernels
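kaggle kernels list -s <search_term>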
Finds public Kaggle notebooks (kernels) matching your search term.
Get Kernel Code
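kaggle kernels pull <owner>/<kernel-slug> [-p <folder>]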
Downloads the code for a specific kernel to your local machine.
Initialize Kernel Metadata
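kaggle kernels init -p <folder>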
Creates a metadata file in a folder, preparing it for kernel creation or updates.
Update Kernel
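kaggle kernels push -p <folder>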
Uploads new code and runs the kernel, updating it on Kaggle.
Get Kernel Output
kaggle kernels output <owner>/<kernel-slug> -p <path>
Downloads the output files generated by a kernel run.
Check Kernel Status
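kaggle kernels status <owner>/<kernel-slug>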
Shows the current status (e.g., running, complete, failed) of a kernel.
5. Models
Kaggle Models are versioned machine learning models you can share, reuse, or deploy. The CLI helps manage these models, from listing and downloading to creating and updating them.
List Models
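kaggle models list -s <search_term>
(The models commands require a recent version of the Kaggle CLI; run kaggle models -h to confirm the exact flags your install supports.)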
Finds public models on Kaggle matching your search term.
Get a Model
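kaggle models get <owner>/<model-slug>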
Downloads a model and its metadata to your local machine.
Initialize Model Metadata
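kaggle models init -p <folder>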
Creates a metadata file in a folder, preparing it for model creation.
Create a New Model
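kaggle models create -p <folder>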
Uploads a new model to Kaggle from your local folder.
Update a Model
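kaggle models update -p <folder>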
Uploads a new version of an existing model.
Delete a Model
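kaggle models delete <owner>/<model-slug>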
Removes a model from Kaggle.
6. Config
Kaggle CLI configuration commands control default behaviors, such as download locations and your default competition. Adjust these settings to make your workflow smoother.
View Config
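kaggle config view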
Displays your current Kaggle CLI configuration settings (e.g., default competition, download path).
Set Config
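kaggle config set -n <name> -v <value>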
Sets a configuration value, such as default competition or download path.
Unset Config
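kaggle config unset -n <name>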
Removes a configuration value, reverting to default behavior.
7. Tips
- Use -h or --help after any command for detailed options and usage
- Use -v for CSV output, -q for quiet mode
- You must accept competition rules on the Kaggle website before downloading or submitting to competitions
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master’s degree in technology management and a bachelor’s degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
Automate SQL Workflows with n8n: Scheduled Database Reports via Email
The Hidden Cost of Routine SQL Reporting
Data teams across organizations face the same recurring challenge: stakeholders require regular reports, but manual SQL reporting consumes valuable time that could be spent on analysis. The process remains consistent regardless of company size — connect to the database, execute queries, format results, and distribute findings to decision-makers.
Data professionals routinely handle reporting tasks that don’t require advanced statistical knowledge or domain expertise, yet they consume significant time through repetitive execution of the same queries and formatting procedures.
This workflow addresses a fundamental efficiency problem: transforming one-time setup into ongoing automated delivery of professional reports directly to stakeholder inboxes.
The Solution: A 4-Node Automated Reporting Pipeline
Building on our previous n8n exploration, this workflow tackles a different automation challenge: scheduled SQL reporting. While our first tutorial focused on data quality analysis, this one demonstrates how n8n handles database integration, recurring schedules, and email distribution.
Unlike writing standalone Python scripts for reporting, n8n workflows are visual, reusable, and easy to modify. You can connect databases, perform transformations, run analyses, and deliver results — all without switching between different tools or environments. Each workflow consists of “nodes” that represent different actions, connected together to create an automated pipeline.
Our automated SQL reporter consists of four connected nodes that transform manual reporting into a hands-off process:
- Schedule Trigger – Runs every Monday at 9 AM
- PostgreSQL Node – Executes sales query against database
- Code Node – Transforms raw data into formatted HTML report
- Send Email Node – Delivers professional report to stakeholders
Building the Workflow: Step-by-Step Implementation
Prerequisites
Step 1: Set Up Your PostgreSQL Database
We’ll create a realistic sales database using Supabase for this tutorial. Supabase is a cloud-based PostgreSQL platform that provides managed databases with built-in APIs and authentication—making it ideal for rapid prototyping and production applications. While this tutorial uses Supabase for convenience, the n8n workflow connects to any PostgreSQL database, including AWS RDS, Google Cloud SQL, or your organization’s existing database infrastructure.
Create Supabase Account:
- Visit supabase.com and sign up for free
- Create new project – choose any name and region
- Wait for setup – takes about 2 minutes for database provisioning
- View your connection details from the Settings > Database page (or the “connect” button on the main page)
Load Sample Data:
Navigate to the SQL Editor in Supabase and run this setup script to create our sales database tables and populate them with sample data:
-- Create employees table
CREATE TABLE employees (
emp_id SERIAL PRIMARY KEY,
first_name VARCHAR(50),
last_name VARCHAR(50),
department VARCHAR(50)
);
-- Create sales table
CREATE TABLE sales (
sale_id SERIAL PRIMARY KEY,
emp_id INTEGER REFERENCES employees(emp_id),
sale_amount DECIMAL(10,2),
sale_date DATE
);
-- Insert sample employees
INSERT INTO employees (first_name, last_name, department) VALUES
('Mike', 'Johnson', 'Sales'),
('John', 'Doe', 'Sales'),
('Tom', 'Wilson', 'Sales'),
('Sarah', 'Chen', 'Marketing');
-- Insert recent sales data
INSERT INTO sales (emp_id, sale_amount, sale_date) VALUES
(1, 2500.00, CURRENT_DATE - 2),
(1, 1550.00, CURRENT_DATE - 5),
(2, 890.00, CURRENT_DATE - 1),
(2, 1500.00, CURRENT_DATE - 4),
(3, 3200.00, CURRENT_DATE - 3),
(4, 1200.00, CURRENT_DATE - 6);
Paste this entire script into the SQL Editor and click the “Run” button in the bottom-right corner. You should see “Success. No rows returned” confirming that your tables and sample data have been created successfully.
Test Your Connection:
Within the same SQL Editor, run a fresh query to verify everything works: SELECT COUNT(*) FROM employees;
You should see 4 employees in the results.
Step 2: Configure Gmail for Automated Sending
Enable App Password:
- Turn on 2-step verification in your Google Account settings
- Generate app password – go to Security > App passwords
- Select “Mail” and “Other” – name it “n8n reporting”
- Copy the 16-character password – you’ll need this for n8n
Step 3: Import and Configure the Workflow
Import the Template:
- Download the workflow file
- Open n8n and click “Import from File”
- Select the downloaded file – all four nodes appear automatically
- Save the workflow as “Automated SQL Reporting”
The imported workflow contains four connected nodes with all the complex SQL and formatting code already configured.
Configure Database Connection:
- Click the PostgreSQL node
- Get your connection details from Supabase by clicking the “Connect” button on your main page. For n8n integration, use the “Transaction pooler” connection string as it’s optimized for automated workflows:
- Create new credential with your Supabase details:
- Host: [your-project].supabase.com
- Database: postgres
- User: postgres…..
- Password: [from Supabase settings]
- Port: 6543
- SSL: Enable
- Test connection – you should see a green success message
Configure Email Settings:
- Click the Send Email node
- Create SMTP credential:
- Host: smtp.gmail.com
- Port: 587
- User: your-email@gmail.com
- Password: [your app password]
- Secure: Enable STARTTLS
- Update recipient in the “To Email” field
That’s it! The analysis logic automatically adapts to different database schemas, table names, and data types.
Step 4: Test and Deploy
- Click “Execute Workflow” in the toolbar
- Watch each node turn green as it processes
- Check your email – you should receive the formatted report
- Toggle to “Active” to enable Monday morning automation
Once the setup is complete, you’ll receive automatic weekly reports without any manual intervention.
Understanding Your Automated Report
Here’s what your stakeholders will receive every Monday:
Email Subject: 📊 Weekly Sales Report – June 27, 2025
Report Content:
- Clean HTML table with proper styling and borders
- Summary statistics calculated automatically from SQL results
- Professional formatting suitable for executive stakeholders
- Timestamp and metadata for audit trails
Here’s what the final report looks like:
The workflow automatically handles all the complex formatting and calculations behind this professional output. Notice how the report includes proper currency formatting, calculated averages, and clean table styling—all generated directly from raw SQL results without any manual intervention. The email arrives with a timestamp, making it easy for stakeholders to track reporting periods and maintain audit trails for decision-making processes.
Technical Deep Dive: Understanding the Implementation
Schedule Trigger Configuration:
The workflow runs every Monday at 9:00 AM using n8n’s interval scheduling. This timing ensures reports arrive before weekly team meetings.
SQL Query Logic:
The PostgreSQL node executes a query that combines JOINs, date filtering, aggregations, and proper numeric formatting (a minimal version is sketched after this list). It automatically:
- Joins employee and sales tables for complete records
- Filters data to the last 7 days using CURRENT_DATE - INTERVAL '7 days'
- Calculates total sales, revenue, and averages per person
- Orders results by revenue for business prioritization
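The exact query ships with the imported workflow template; a minimal version consistent with the schema from Step 1 could look like this (the column aliases are illustrative, not taken from the template):

SELECT
    e.first_name || ' ' || e.last_name AS employee,
    COUNT(s.sale_id) AS total_sales,
    SUM(s.sale_amount) AS total_revenue,
    ROUND(AVG(s.sale_amount), 2) AS avg_sale
FROM employees e
JOIN sales s ON s.emp_id = e.emp_id
WHERE s.sale_date >= CURRENT_DATE - INTERVAL '7 days'
GROUP BY e.emp_id, e.first_name, e.last_name
ORDER BY total_revenue DESC;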
HTML Generation Logic:
The Code node transforms SQL results into professional HTML using JavaScript. It iterates through query results, builds styled HTML tables with consistent formatting, calculates summary statistics, and adds professional touches like emojis and timestamps.
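As a rough sketch, assuming the column aliases from the query above, the Code node’s JavaScript might look something like this (the code bundled with the workflow template adds richer styling and more summary rows):

// n8n Code node (run once for all items) -- minimal illustrative sketch
const rows = $input.all().map(item => item.json);

// Build a simple styled HTML table from the query results
let table = '<table border="1" cellpadding="6"><tr><th>Employee</th><th>Sales</th><th>Revenue</th></tr>';
for (const r of rows) {
  table += `<tr><td>${r.employee}</td><td>${r.total_sales}</td><td>$${Number(r.total_revenue).toFixed(2)}</td></tr>`;
}
table += '</table>';

// Summary statistics and metadata for the email body
const totalRevenue = rows.reduce((sum, r) => sum + Number(r.total_revenue), 0);
const html = `<h2>📊 Weekly Sales Report</h2>${table}` +
  `<p>Total revenue (last 7 days): $${totalRevenue.toFixed(2)}</p>` +
  `<p>Generated: ${new Date().toISOString()}</p>`;

return [{ json: { html } }];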
Email Delivery:
The Send Email node uses Gmail’s SMTP service with proper authentication and HTML rendering support.
Testing with Different Scenarios
To see how the workflow handles varying data patterns, try these modifications:
- Different Time Periods: Change INTERVAL '7 days' to INTERVAL '30 days' for monthly reports
- Department Filtering: Add e.department = 'Sales' to the WHERE clause for team-specific reports
- Different Metrics: Modify the SELECT clause to include product categories or customer segments
Based on your business needs, you can determine next steps: weekly reports work well for operational teams, monthly reports suit strategic planning, quarterly reports serve executive dashboards, and daily reports help with real-time monitoring. The workflow adapts automatically to any SQL structure, allowing you to quickly create multiple reporting pipelines for different stakeholders.
Next Steps
1. Multi-Database Support
Replace the PostgreSQL node with MySQL, SQL Server, or any supported database. The workflow logic remains identical while connecting to different data sources. This flexibility makes the solution valuable across diverse technology stacks.
2. Advanced Scheduling
Modify the Schedule Trigger for different frequencies. Set up daily reports for operational metrics, monthly reports for strategic planning, or quarterly reports for board meetings. Each schedule can target different recipient groups with tailored content.
3. Enhanced Formatting
Extend the Code node to include charts and visualizations using Chart.js, conditional formatting based on performance thresholds, or executive summaries with key insights. The HTML output supports rich formatting and embedded graphics.
4. Multi-Recipient Distribution
Add logic to send different reports to different stakeholders. Sales managers receive individual team reports, executives receive high-level summaries, and finance teams receive revenue-focused metrics. This targeted approach ensures each audience gets relevant information.
Conclusion
This automated SQL reporting workflow demonstrates how n8n bridges the gap between data science expertise and operational efficiency. By combining database integration, scheduling, and email automation, you can eliminate routine reporting tasks while delivering professional results to stakeholders.
The workflow’s modular design makes it particularly valuable for data teams managing multiple reporting requirements. You can duplicate the workflow for different databases, modify the SQL queries for various metrics, and adjust the formatting for different audiences—all without writing custom scripts or managing server infrastructure.
Unlike traditional ETL tools that require extensive configuration, n8n’s visual interface makes complex data workflows accessible to both technical and non-technical team members. Your SQL expertise remains the core value, while n8n handles the automation infrastructure, scheduling reliability, and delivery mechanisms.
Most importantly, this approach scales with your organization’s needs. Start with simple weekly reports, then expand to include data visualizations, multi-database queries, or integration with business intelligence platforms. The foundation you build today becomes the automated reporting infrastructure that supports your team’s growth tomorrow.
Born in India and raised in Japan, Vinod brings a global perspective to data science and machine learning education. He bridges the gap between emerging AI technologies and practical implementation for working professionals. Vinod focuses on creating accessible learning pathways for complex topics like agentic AI, performance optimization, and AI engineering. He focuses on practical machine learning implementations and mentoring the next generation of data professionals through live sessions and personalized guidance.
Common Sense, Not AI, Could Have Prevented RCB Stampede: K’taka IT Minister

In the aftermath of the tragic stampede linked to Royal Challengers Bengaluru (RCB) celebrations that left 11 dead, Karnataka IT Minister Priyank Kharge said the situation called for common sense, not artificial intelligence.
“I think common sense would have helped there rather than AI,” said Kharge, addressing the issue of crowd and traffic mismanagement in a video podcast with AIM. Taking accountability, he said, “On that particular day, I think all of us could have done better.”
Kharge said the responsibility was collective and not limited to one entity. He pointed out that BCCI, IPL officials, the government, the home department, the RCB team, and fans could have all performed better in their respective roles.
Responding to whether AI could have helped manage the crowd flow, Kharge admitted the technology might have flagged high-density areas, but the ultimate call needed to be taken by people on the ground. “AI is going to tell us that there’s too much of a crowd and there needs management. The management has to happen physically,” he said.
Calling it an unfortunate incident, Kharge said the government has accepted responsibility and is taking steps to prevent such tragedies from occurring in the future.
“We’ve already owned up to it, and we’re working towards it. Probably we’ll be getting a crowd management bill as well, where the use of such technologies will be advocated,” the minister said. The Central Administrative Tribunal (CAT) has held RCB “prima facie responsible” for the incident, citing inadequate preparation and sudden announcements.
Earlier this week, however, RCB formally challenged this. The cricketing team argues that the CAT made these observations without giving it a chance to present its side, violating the principles of natural justice.