  1. Article

    A survey on interpretable reinforcement learning

    Although deep reinforcement learning has become a promising machine learning approach for sequential decision-making problems, it is still not mature enough for high-stake domains such as autonomous driving or...

    Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang in Machine Learning (2024)

  2. Article

    Generalization in Deep RL for TSP Problems via Equivariance and Local Search

    Deep reinforcement learning (RL) has proved to be a competitive heuristic for solving small-sized instances of traveling salesman problems (TSP), but its performance on larger-sized instances is insufficient. ...

    Wenbin Ouyang, Yisen Wang, Paul Weng, Shaochen Han in SN Computer Science (2024)

  3. Chapter and Conference Paper

    Fair Deep Reinforcement Learning with Generalized Gini Welfare Functions

    Learning fair policies in reinforcement learning (RL) is important when the RL agent’s actions may impact many users. In this paper, we investigate a generalization of this problem where equity is still desire...

    Guanbao Yu, Umer Siddique, Paul Weng in Autonomous Agents and Multiagent Systems. … (2024)

  4. Chapter and Conference Paper

    Improving Subtour Elimination Constraint Generation in Branch-and-Cut Algorithms for the TSP with Machine Learning

    Branch-and-Cut is a widely-used method for solving integer programming problems exactly. In recent years, researchers have been exploring ways to use Machine Learning to improve the decision-making process of ...

    Thi Quynh Trang Vo, Mourad Baiou, Viet Hung Nguyen in Learning and Intelligent Optimization (2023)

  5. Chapter and Conference Paper

    Unsupervised Salient Patch Selection for Data-Efficient Reinforcement Learning

    To improve the sample efficiency of vision-based deep reinforcement learning (RL), we propose a novel method, called SPIRL, to automatically extract important patches from input images. Following Masked Auto-Enco...

    Zhaohui Jiang, Paul Weng in Machine Learning and Knowledge Discovery i… (2023)

  6. Chapter and Conference Paper

    Safe Distributional Reinforcement Learning

    Safety in reinforcement learning (RL) is a key property in both training and execution in many domains such as autonomous driving or finance. In this paper, we formalize it with a constrained RL formulation in...

    Jianyi Zhang, Paul Weng in Distributed Artificial Intelligence (2022)

  7. Chapter and Conference Paper

    Planning with Q-Values in Sparse Reward Reinforcement Learning

    Learning a policy from sparse rewards is a main challenge in reinforcement learning (RL). The best solutions to this challenge have been via sample inefficient model-free RL algorithms. Model-based RL algorith...

    Hejun Lei, Paul Weng, Juan Rojas, Yisheng Guan in Intelligent Robotics and Applications (2022)

  8. Chapter

    Reinforcement Learning

    Reinforcement learning (RL) is a general framework for adaptive control, which has proven to be efficient in many domains, e.g., board games, video games or autonomous vehicles. In such problems, an agent face...

    Olivier Buffet, Olivier Pietquin, Paul Weng in A Guided Tour of Artificial Intelligence R… (2020)

  9. Chapter

    A Hierarchical Approach Based on the Frank–Wolfe Algorithm and Dantzig–Wolfe Decomposition for Solving Large Economic Dispatch Problems in Smart Grids

    A microgrid is an integrated energy system consisting of distributed energy resources and multiple electrical loads operating as a single, autonomous grid either in parallel to or “islanded” from the existing ...

    Jianyi Zhang, M. Hadi Amini, Paul Weng in Smart Microgrids (2019)

  10. Article

    Optimal Threshold Policies for Robust Data Center Control

    With the simultaneous rise of energy costs and demand for cloud computing, efficient control of data centers becomes crucial. In the data center control problem, one needs to plan at every time step how many s...

    Paul Weng, Zeqi Qiu 邱泽麒, John Costanzo in Journal of Shanghai Jiaotong University (S… (2018)

  11. Chapter and Conference Paper

    Optimal Threshold Policies for Robust Data Center Control

    With the simultaneous rise of energy costs and demand for cloud computing, efficient control of data centers becomes crucial. In the data center control problem, one needs to plan at every time step how many s...

    Paul Weng, Zeqi Qiu, John Costanzo in AETA 2017 - Recent Advances in Electrical … (2018)

  12. Chapter and Conference Paper

    An Efficient Primal-Dual Algorithm for Fair Combinatorial Optimization Problems

    We consider a general class of combinatorial optimization problems including among others allocation, multiple knapsack, matching or travelling salesman problems. The standard version of those problems is the ...

    Viet Hung Nguyen, Paul Weng in Combinatorial Optimization and Applications (2017)

  13. Chapter and Conference Paper

    From Preference-Based to Multiobjective Sequential Decision-Making

    In this paper, we present a link between preference-based and multiobjective sequential decision-making. While transforming a multiobjective problem to a preference-based one is quite natural, the other direct...

    Paul Weng in Multi-disciplinary Trends in Artificial Intelligence (2016)

  14. Chapter and Conference Paper

    Finding Risk-Averse Shortest Path with Time-Dependent Stochastic Costs

    In this paper, we tackle the problem of risk-averse route planning in a transportation network with time-dependent and stochastic costs. To solve this problem, we propose an adaptation of the A* algorithm that...

    Dajian Li, Paul Weng, Orkun Karabasoglu in Multi-disciplinary Trends in Artificial In… (2016)

  15. Chapter and Conference Paper

    Reducing the Number of Queries in Interactive Value Iteration

    To tackle the potentially hard task of defining the reward function in a Markov Decision Process (MDP), a new approach, called Interactive Value Iteration (IVI), has recently been proposed by Weng and Zanuttin...

    Hugo Gilbert, Olivier Spanjaard, Paolo Viappiani, Paul Weng in Algorithmic Decision Theory (2015)

  16. Article

    Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm

    We introduce a novel approach to preference-based reinforcement learning, namely a preference-based variant of a direct policy search method based on evolutionary optimization. The core of our approach is a pr...

    Róbert Busa-Fekete, Balázs Szörényi, Paul Weng, Weiwei Cheng in Machine Learning (2014)

  17. Book and Conference Proceedings

    Multi-disciplinary Trends in Artificial Intelligence

    8th International Workshop, MIWAI 2014, Bangalore, India, December 8-10, 2014. Proceedings

    M. Narasimha Murty, Xiangjian He in Lecture Notes in Computer Science (2014)

  18. Chapter and Conference Paper

    Solving Hidden-Semi-Markov-Mode Markov Decision Problems

    Hidden-Mode Markov Decision Processes (HM-MDPs) were proposed to represent sequential decision-making problems in non-stationary environments that evolve according to a Markov chain. We introduce in this paper...

    Emmanuel Hadoux, Aurélie Beynier, Paul Weng in Scalable Uncertainty Management (2014)

  19. Chapter and Conference Paper

    Axiomatic Foundations of Generalized Qualitative Utility

    The aim of this paper is to provide a unifying axiomatic justification for a class of qualitative decision models comprising among others optimistic/pessimistic qualitative utilities, binary possibilistic util...

    Paul Weng in Multi-disciplinary Trends in Artificial Intelligence (2013)

  20. Chapter and Conference Paper

    Markov Decision Processes with Functional Rewards

    Markov decision processes (MDP) have become one of the standard models for decision-theoretic planning problems under uncertainty. In its standard form, rewards are assumed to be numerical additive scalars. In...

    Olivier Spanjaard, Paul Weng in Multi-disciplinary Trends in Artificial Intelligence (2013)
